The muHVT package is a collection of R functions for building topology-preserving maps of rich multivariate data, with an emphasis on big data, that is, datasets with a large number of rows. The functions are organized around the following typical workflow:
Data Compression: Vector quantization (VQ), HVQ (hierarchical vector quantization) using means or medians. This step compresses the rows (long data frame) using a compression objective
Data Projection: Projection of the compressed cells to 1D, 2D and 3D with Sammon’s non-linear algorithm. This step creates topology-preserving map coordinates in the desired output dimension
Tessellation: Create the cells required for object visualization using the Voronoi tessellation method. The package includes heatmap plots for hierarchical Voronoi tessellations (HVT). This step enables data insights, visualization, and interaction with the topology-preserving map, and is useful for semi-supervised tasks
Prediction: Scoring new data sets and recording their assignment using the map objects from the above steps, in a sequence of maps if required
06th December, 2022
This package now additionally provides functionality to predict based on a set of maps to monitor entities over time.
The creation of a predictive set of maps involves four steps -
Let us try to understand the steps with the help of the diagram below -
Figure 1: Flow diagram for predicting based on a set of maps using mlayerHVT()
Initially, the raw data is passed, and a highly compressed Map A is
constructed using the HVT function. The
output of this function will be hierarchically arranged vector quantized
data that is used to identify the outlier cells in the dataset using the
number of data points within each cell and the z-scores for each
cell.
The identified outlier cell(s) is then passed to the
removeOutliers function along with Map A.
This function removes the identified outlier cell(s) from the dataset
and stores them in Map B as shown in the diagram. The final output of
this function is a list of two items - a newly constructed map (Map B),
and a subset of the dataset without outlier cell(s).
The plotCells function plots the
Voronoi tessellations for the compressed map (Map A) and highlights the
identified outlier cell(s) in red on the plot. The function requires the
identified outlier cell(s) number and the compressed map (Map A) as
input in order to plot the tessellations map and highlight those outlier
cells on it.
The dataset without outlier(s), obtained as output from the
removeOutliers function, is then passed as an argument to the
HVT function with other parameters such as
n_cells, quant.err, depth, etc. to construct another map (Map C).
Finally, all the constructed maps are passed to the
mlayerHVT function along with the test
dataset on which the function will predict/score for finding which map
and what cell each test record gets assigned to.
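The scoring idea in the last step can be illustrated for a single map. The sketch below is not the mlayerHVT implementation. It assumes that a record is assigned to the cell whose centroid is nearest under the L1 distance, and `score_cells`, `codebook` and `new_rows` are hypothetical names used only for illustration.

```r
# Hypothetical helper (not part of muHVT): assign each new record to the
# cell whose centroid is nearest under the L1 distance.
score_cells <- function(new_data, codebook) {
  new_data <- as.matrix(new_data)
  codebook <- as.matrix(codebook)
  apply(new_data, 1, function(row) {
    # L1 distance from this record to every centroid in the codebook
    l1 <- rowSums(abs(sweep(codebook, 2, row)))
    unname(which.min(l1))
  })
}

# Toy codebook with two centroids and three records to score
codebook <- rbind(c(0, 0), c(10, 10))
new_rows <- rbind(c(1, 1), c(9, 8), c(-2, 0))
score_cells(new_rows, codebook)  # 1 2 1
```

With a sequence of maps, the same assignment would be repeated per map; how the package chooses between maps is not covered by this sketch.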
For detailed information on the above functions, refer to the vignette here.
library(devtools)
devtools::install_github(repo = "Mu-Sigma/muHVT", ref = "dev")
This package can perform vector quantization using the following algorithms -
For k-means, the second and third steps are iterated until a predefined number of iterations is reached or the clusters converge. The runtime for the algorithm is O(n).
For k-medoids, the second and third steps are iterated until a predefined number of iterations is reached or the clusters converge. The runtime for the algorithm is O(k * (n-k)^2).
The algorithm divides the dataset recursively into cells using the \(k\)-means or \(k\)-medoids algorithm. The maximum number of subsets is set by n_cells. Setting n_cells to, say, five divides the dataset into a maximum of five subsets. Each of these five subsets is further divided into five subsets (or fewer), resulting in a total of up to twenty-five (5*5) subsets. The recursion terminates when a cell contains fewer than three data points or a stop criterion is reached. In this case, the stop criterion is met when the cell error falls below the quantization threshold.
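The recursion described above can be sketched in a few lines of base R. This is a minimal illustration, not the package's hvq function: `hvq_sketch` and its arguments are hypothetical names, it uses `stats::kmeans` only, and it assumes the mean L1 distance to the centroid as the cell error.

```r
# Hypothetical sketch of hierarchical vector quantization with k-means.
# Returns one row per cell with its level, size and quantization error.
hvq_sketch <- function(x, n_cells, depth, quant_err, level = 1, parent = "root") {
  x <- as.matrix(x)
  # k-means needs fewer centres than rows, so cap k for small cells
  k <- min(n_cells, nrow(x) - 1)
  fit <- stats::kmeans(x, centers = k, nstart = 5)
  out <- list()
  for (j in seq_len(k)) {
    pts <- x[fit$cluster == j, , drop = FALSE]
    # Cell error: mean L1 distance of the points to the cell centroid
    qe <- mean(rowSums(abs(sweep(pts, 2, fit$centers[j, ]))))
    id <- paste(parent, j, sep = ".")
    out[[length(out) + 1]] <- data.frame(level = level, cell = id,
                                         n = nrow(pts), quant_error = qe)
    # Split further only if depth allows, the cell has at least three
    # points, and its error is still above the threshold
    if (level < depth && nrow(pts) >= 3 && qe > quant_err) {
      out[[length(out) + 1]] <- hvq_sketch(pts, n_cells, depth, quant_err,
                                           level + 1, id)
    }
  }
  do.call(rbind, out)
}

set.seed(1)
toy <- matrix(rnorm(400), ncol = 2)  # 200 synthetic points in 2D
cells <- hvq_sketch(toy, n_cells = 5, depth = 2, quant_err = 0.5)
head(cells)
```

Each level 1 cell appears once in the result, followed by its children when it was split, so the sizes of the level 1 cells add up to the number of input rows.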
The steps for this method are as follows :
The stop criterion is when the quantization error of a cell satisfies one of the below conditions
The quantization error for a cell is defined as follows :
\[QE = \max_i(||A-F_i||_{p})\]
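where A is the centroid of the cell, \(F_i\) are the m data points in the cell, and p selects the norm (p = 1 for L1_Norm, p = 2 for L2_Norm).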
Let us try to understand quantization error with an example.
Figure 2: The Voronoi tessellation for level 1 shown for the 5 cells with the points overlaid
An example of a 2 dimensional VQ is shown above.
In the above image, we can see 5 cells with each cell containing a certain number of points. The centroid for each cell is shown in blue. These centroids are also known as codewords since they represent all the points in that cell. The set of all codewords is called a codebook.
Now we want to calculate quantization error for each cell. For the
sake of simplicity, let’s consider only one cell having centroid
A and m data points \(F_i\) for calculating quantization
error.
For each point, we calculate the distance between the point and the centroid.
\[ d = ||A - F_i||_{p} \]
In the above equation, p = 1 means L1_Norm distance
whereas p = 2 means L2_Norm distance. In the package, the
L1_Norm distance is chosen by default. The user can pass
either L1_Norm, L2_Norm or a custom function
to calculate the distance between two points in n dimensions.
\[QE = \max_i(||A-F_i||_{p})\]
Now, we take the maximum calculated distance of all m points. This
gives us the furthest distance of a point in the cell from the centroid,
which we refer to as Quantization Error. If the
Quantization Error is higher than the given threshold, the centroid/
codevector is not a good representation for the points in the cell. Now
we can perform further Vector Quantization on these points and repeat
the above steps.
Please note that the user can select mean, max or any custom function
to calculate the Quantization Error. The custom function takes a vector
of m values (where each value is the distance between a point in
n dimensions and the centroid) and returns a single value,
which is the Quantization Error for the cell.
If we select mean as the error metric, the above
Quantization Error equation will look like this :
\[QE = \frac{1}{m}\sum_{i=1}^m||A-F_i||_{p}\]
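A small worked example may make the two error metrics concrete. The helper names below (`l1_norm`, `l2_norm`, `quant_error`) are hypothetical and only mirror the formulas above. They are not the package's internal functions.

```r
# Hypothetical helpers that mirror the formulas above
l1_norm <- function(a, f) sum(abs(a - f))
l2_norm <- function(a, f) sqrt(sum((a - f)^2))

quant_error <- function(points, centroid, distance = l1_norm, error = max) {
  # One distance per point in the cell, then reduce to a single value
  d <- apply(points, 1, function(f) distance(centroid, f))
  error(d)
}

# One cell with m = 4 points in 2 dimensions
pts <- rbind(c(0, 0), c(4, 0), c(0, 2), c(4, 6))
A <- colMeans(pts)                   # centroid (2, 2)

quant_error(pts, A, l1_norm, max)    # 6: L1 distances are 4, 4, 2, 6
quant_error(pts, A, l1_norm, mean)   # 4: (4 + 4 + 2 + 6) / 4
quant_error(pts, A, l2_norm, max)    # sqrt(20), about 4.47
quant_error(pts, A, l1_norm, median) # a custom error function also works
```

The same cell therefore passes or fails a given threshold depending on the chosen distance and error metric, which is why both are recorded in the compression summary parameters.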
A Voronoi diagram is a way of dividing space into a number of regions. A set of points (called seeds, sites, or generators) is specified beforehand, and for each seed there is a corresponding region consisting of all points closer to that seed than to any other. These regions are called Voronoi cells. The Voronoi diagram is the dual of the Delaunay triangulation.
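The definition can be checked numerically without any tessellation library. The sketch below labels every point of a regular grid with its nearest seed, which is a discrete approximation of the Voronoi cells. The seeds and the grid are made-up values for illustration.

```r
# Three hypothetical seeds in the plane
seeds <- rbind(c(1, 1), c(4, 1), c(2.5, 4))

# Regular grid of query points covering [0, 5] x [0, 5]
grid <- as.matrix(expand.grid(x = seq(0, 5, by = 0.5),
                              y = seq(0, 5, by = 0.5)))

# Squared Euclidean distance from every grid point to every seed
d2 <- sapply(seq_len(nrow(seeds)), function(s) {
  (grid[, 1] - seeds[s, 1])^2 + (grid[, 2] - seeds[s, 2])^2
})

# Voronoi cell of a point = index of its nearest seed
cell <- max.col(-d2, ties.method = "first")
table(cell)
```

Plotting `grid` coloured by `cell` shows the three regions. An exact tessellation computes the polygon boundaries instead of labelling sample points.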
Sammon’s projection is an algorithm that maps a high-dimensional space to a space of lower dimensionality while attempting to preserve the structure of inter-point distances in the projection. It is particularly suited for use in exploratory data analysis and is usually considered a non-linear approach since the mapping cannot be represented as a linear combination of the original variables. The centroids are plotted in 2D after performing Sammon’s projection at every level of the tessellation.
Denote the distance between the \(i^{th}\) and \(j^{th}\) objects in the original space by \(d_{ij}^*\), and the distance between their projections by \(d_{ij}\). Sammon’s mapping aims to minimize the error function below, which is often referred to as Sammon’s stress or Sammon’s error:
\[E=\frac{1}{\sum_{i<j} d_{ij}^*}\sum_{i<j}\frac{(d_{ij}^*-d_{ij})^2}{d_{ij}^*}\]
The minimization of this error can be performed either by gradient descent, as proposed initially, or by other means, usually involving iterative methods. The number of iterations needs to be determined experimentally, and convergence is not always guaranteed. Many implementations prefer to use the first principal components as a starting configuration.
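The stress formula can be evaluated directly from two distance objects. The sketch below uses synthetic data, a hypothetical helper named `sammon_stress`, and classical scaling (`cmdscale`) as the starting configuration. If the MASS package is installed, it also runs `MASS::sammon` from that start and recomputes the stress.

```r
# Hypothetical helper: Sammon's stress E for original and projected distances
sammon_stress <- function(d_orig, d_proj) {
  d_star <- as.vector(d_orig)  # pairwise distances for i < j
  d <- as.vector(d_proj)
  sum((d_star - d)^2 / d_star) / sum(d_star)
}

set.seed(240)
x <- matrix(rnorm(50 * 5), ncol = 5)  # 50 synthetic points in 5 dimensions
d_star <- dist(x)

# Starting configuration: classical scaling to 2D
init <- cmdscale(d_star, k = 2)
stress_init <- sammon_stress(d_star, dist(init))

# Refine with Sammon's mapping when MASS is available
if (requireNamespace("MASS", quietly = TRUE)) {
  fit <- MASS::sammon(d_star, y = init, k = 2, trace = FALSE)
  stress_fit <- sammon_stress(d_star, dist(fit$points))
}
```

Comparing `stress_init` with `stress_fit` shows how much the iterative refinement improves on the linear starting configuration for a given dataset.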
In this package, we use sammon from the package
MASS to project higher dimensional data to a 2D space. The
function hvq called from the HVT function
returns hierarchical quantized data which will be the input for
construction of the tessellations. The data is then represented in 2D
coordinates and the tessellations are plotted using these coordinates as
centroids. We use the package deldir for this purpose. The
deldir package computes the Delaunay triangulation (and
hence the Dirichlet or Voronoi tessellation) of a planar point set
according to the second (iterative) algorithm of Lee and Schacter. For
subsequent levels, transformation is performed on the 2D coordinates to
get all the points within its parent tile. Tessellations are plotted
using these transformed points as centroids. The lines in the
tessellations are chopped in places so that they do not protrude outside
the parent polygon. This is done for all the subsequent levels.
In this section, we will use the
Prices of Personal Computers dataset. This dataset contains
6259 observations and 10 features. The dataset observes the price from
1993 to 1995 of 486 personal computers in the US. The variables are
price, speed, ram, screen, cd, etc. The dataset can be downloaded from
here.
In this example, we will compress this dataset by using hierarchical VQ via k-means and visualize the Voronoi Tessellation plots using Sammon’s projection. Later on, we will overlay the price and speed variables as a heatmap to generate further insights.
Here, we load the data and store into a variable
computers.
set.seed(240)
# Load data from csv files
computers <- read.csv("https://raw.githubusercontent.com/Mu-Sigma/muHVT/dev/vignettes/sample_dataset/Computers.csv")
Let’s have a look at some of the data
# Quick peek
Table(head(computers))

| X | price | speed | hd | ram | screen | cd | multi | premium | ads | trend |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1499 | 25 | 80 | 4 | 14 | no | no | yes | 94 | 1 |
| 2 | 1795 | 33 | 85 | 2 | 14 | no | no | yes | 94 | 1 |
| 3 | 1595 | 25 | 170 | 4 | 15 | no | no | yes | 94 | 1 |
| 4 | 1849 | 25 | 170 | 8 | 14 | no | no | no | 94 | 1 |
| 5 | 3295 | 33 | 340 | 16 | 14 | no | no | yes | 94 | 1 |
| 6 | 3695 | 66 | 340 | 16 | 14 | no | no | yes | 94 | 1 |
Now let us check the structure of the data
str(computers)
#> 'data.frame': 6259 obs. of 11 variables:
#> $ X : int 1 2 3 4 5 6 7 8 9 10 ...
#> $ price : int 1499 1795 1595 1849 3295 3695 1720 1995 2225 2575 ...
#> $ speed : int 25 33 25 25 33 66 25 50 50 50 ...
#> $ hd : int 80 85 170 170 340 340 170 85 210 210 ...
#> $ ram : int 4 2 4 8 16 16 4 2 8 4 ...
#> $ screen : int 14 14 15 14 14 14 14 14 14 15 ...
#> $ cd : chr "no" "no" "no" "no" ...
#> $ multi : chr "no" "no" "no" "no" ...
#> $ premium: chr "yes" "yes" "yes" "no" ...
#> $ ads : int 94 94 94 94 94 94 94 94 94 94 ...
#> $ trend : int 1 1 1 1 1 1 1 1 1 1 ...
Let’s get a summary of the data
summary(computers)
#> X price speed hd
#> Min. : 1 Min. : 949 Min. : 25.00 Min. : 80.0
#> 1st Qu.:1566 1st Qu.:1794 1st Qu.: 33.00 1st Qu.: 214.0
#> Median :3130 Median :2144 Median : 50.00 Median : 340.0
#> Mean :3130 Mean :2220 Mean : 52.01 Mean : 416.6
#> 3rd Qu.:4694 3rd Qu.:2595 3rd Qu.: 66.00 3rd Qu.: 528.0
#> Max. :6259 Max. :5399 Max. :100.00 Max. :2100.0
#> ram screen cd multi
#> Min. : 2.000 Min. :14.00 Length:6259 Length:6259
#> 1st Qu.: 4.000 1st Qu.:14.00 Class :character Class :character
#> Median : 8.000 Median :14.00 Mode :character Mode :character
#> Mean : 8.287 Mean :14.61
#> 3rd Qu.: 8.000 3rd Qu.:15.00
#> Max. :32.000 Max. :17.00
#> premium ads trend
#> Length:6259 Min. : 39.0 Min. : 1.00
#> Class :character 1st Qu.:162.5 1st Qu.:10.00
#> Mode :character Median :246.0 Median :16.00
#> Mean :221.3 Mean :15.93
#> 3rd Qu.:275.0 3rd Qu.:21.50
#> Max. :339.0 Max. :35.00
Let us first split the data into train and test. We will use 80% of the data as train and the remainder as test.
noOfPoints <- dim(computers)[1]
trainLength <- as.integer(noOfPoints * 0.8)
trainComputers <- computers[1:trainLength,]
testComputers <- computers[(trainLength+1):noOfPoints,]
K-means is not suitable for factor variables as the sample space for factor variables is discrete. A Euclidean distance function on such a space isn’t really meaningful. Hence, we will delete the factor variables in our dataset.
Here we keep the original trainComputers and
testComputers as we will use the price variable from this
dataset to overlay as heatmap and generate some insights.
library(dplyr)  # provides the %>% pipe used below
trainComputers <-
trainComputers %>% dplyr::select(-c(X, cd, multi, premium, trend))
testComputers <-
testComputers %>% dplyr::select(-c(X, cd, multi, premium, trend))
Let us try to understand the HVT function first.
HVT(
dataset,
n_cells,
depth,
quant.err,
projection.scale,
normalize = T,
distance_metric = c("L1_Norm", "L2_Norm"),
error_metric = c("mean", "max"),
quant_method = c("kmeans", "kmedoids"),
diagnose = TRUE,
hvt_validation = FALSE,
train_validation_split_ratio = 0.8
)
Each of the parameters is explained below:
dataset - A dataframe with numeric
columns
n_cells - An integer indicating the
number of cells per hierarchy (level)
depth - An integer indicating the
number of levels. (1 = No hierarchy, 2 = 2 levels, etc …)
quant.err - A number indicating
the quantization error threshold. A cell will only break down into
further cells if the quantization error of the cell is above the defined
quantization error threshold
distance_metric - The distance
metric can be L1_Norm or L2_Norm.
L1_Norm is selected by default. The distance metric is used
to calculate the distance between an n dimensional point
and centroid. The user can also pass a custom function to calculate this
distance
error_metric - The error metric can
be mean or max. max is selected
by default. max will return the max of m
values and mean will take mean of m values
where each value is a distance between a point and centroid of the cell.
Moreover, the user can also pass a custom function to calculate the
error metric
quant_method - The quantization
method can be kmeans or kmedoids.
kmeans is selected by default
normalize - A logical value
indicating whether the columns in your dataset need to be normalized.
Default value is TRUE. The algorithm supports Z-score
normalization
diagnose - A logical value
indicating whether user wants to perform diagnostics on the model.
Default value is TRUE.
hvt_validation - A logical value
indicating whether user wants to holdout a validation set and find mean
absolute deviation of the validation points from the centroid. Default
value is FALSE.
train_validation_split_ratio - A
numeric value indicating train validation split ratio. This argument is
only used when hvt_validation has been set to TRUE. Default value for
the argument is 0.8
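Z-score normalization, as referred to under normalize, rescales each column to mean 0 and standard deviation 1. The sketch below uses a few rows from the table above. Applying the training means and standard deviations to the test rows is an assumption made for illustration, not a statement about what the package does internally.

```r
# A few rows of two numeric columns from the computers data
train <- data.frame(price = c(1499, 1795, 1595, 1849),
                    speed = c(25, 33, 25, 25))
test <- data.frame(price = c(3295, 3695),
                   speed = c(33, 66))

# Column means and standard deviations estimated on the training rows
mu <- colMeans(train)
sigma <- apply(train, 2, sd)

# z = (x - mean) / sd, column by column
train_z <- scale(train, center = mu, scale = sigma)
# Assumption: new data are rescaled with the training statistics
test_z <- scale(test, center = mu, scale = sigma)

round(train_z, 2)
round(test_z, 2)
```

This is why the centroids in the summary tables are small signed numbers rather than prices or clock speeds: they are expressed in standard deviations from the column mean.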
First we will perform hierarchical Vector Quantization at level 1 by setting the parameter depth to 1 and the number of cells to 15. Here, level 1 signifies no hierarchy.
set.seed(240)
hvt.results <- list()
hvt.results <- muHVT::HVT(trainComputers,
n_cells = 15,
depth = 1,
quant.err = 0.2,
projection.scale = 10,
normalize = T,
distance_metric = "L1_Norm",
error_metric = "mean",
quant_method = "kmeans")
Now let’s try to understand the plotHVT function. The parameters are explained in detail below.
plotHVT(hvt.results, line.width, color.vec, pch1 = 21, centroid.size = 3, title = NULL, maxDepth = 1)
hvt.results - A list containing the
output of the HVT function which has the details of the tessellations to
be plotted
line.width - A vector indicating
the line widths of the tessellation boundaries for each level
color.vec - A vector indicating the
colors of the tessellations boundaries at each level
pch1 - Symbol type of the centroids
of the tessellations (parent levels). Refer points (default =
21)
centroid.size - Size of centroids
of first level tessellations (default = 3)
title - Set a title for the plot
(default = NULL)
Let’s plot the Voronoi tessellation
# Voronoi tessellation plot for level one
muHVT::plotHVT(hvt.results,
line.width = c(0.6),
color.vec = c("#141B41"),
centroid.size = 1.5,
maxDepth = 1)
Figure 3: The Voronoi Tessellation for level 1 shown for the 15 cells in the dataset ’computers’
As per the manual, hvt.results[[3]]
gives us detailed information about the hierarchical vector quantized
data.
hvt.results[[3]][['summary']] gives a
table containing the number of points, the Quantization Error and the
codebook.
Now let us understand what each column in the above table means:
Segment.Level - Level of the cell.
In this case, we have performed Vector Quantization for depth 1. Hence
Segment Level is 1
Segment.Parent - Parent segment of
the cell
Segment.Child (Cell.Number) - The
children of a particular cell. In this case, it is the total number of
cells at which we achieved the defined compression percentage
n - No of points in each
cell
Cell.ID - Cell_ID’s are generated
for the multivariate data using 1-D Sammon’s Projection
algorithm
Quant.Error - Quantization Error
for each cell
All the columns after this will contain centroids for each cell. They can also be called a codebook, which represents a collection of all centroids or codewords.
summaryTable(hvt.results[[3]][['summary']])

| Segment.Level | Segment.Parent | Segment.Child | n | Cell.ID | Quant.Error | price | speed | hd | ram | screen | ads |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 480 | 10 | 0.38 | 0.69 | 0.70 | 0.24 | -0.02 | 0.06 | 0.57 |
| 1 | 1 | 2 | 390 | 11 | 0.57 | 0.83 | 0.21 | 0.05 | 0.10 | 2.88 | 0.10 |
| 1 | 1 | 3 | 145 | 7 | 0.41 | 0.27 | 2.67 | 0.17 | -0.20 | -0.17 | 0.71 |
| 1 | 1 | 4 | 505 | 6 | 0.3 | -0.17 | -0.80 | 0.24 | -0.04 | -0.31 | 0.42 |
| 1 | 1 | 5 | 241 | 5 | 0.32 | -0.34 | 0.66 | -0.73 | -0.75 | -0.40 | -0.40 |
| 1 | 1 | 6 | 150 | 14 | 0.57 | 0.90 | -0.55 | 2.71 | 2.32 | 0.29 | -0.60 |
| 1 | 1 | 7 | 286 | 12 | 0.27 | 0.75 | -0.71 | 0.79 | 1.61 | -0.41 | 0.35 |
| 1 | 1 | 8 | 258 | 8 | 0.35 | -0.39 | 0.76 | 0.71 | 0.00 | -0.16 | -0.54 |
| 1 | 1 | 9 | 324 | 3 | 0.3 | -1.08 | -0.79 | -0.56 | -0.69 | -0.38 | -0.76 |
| 1 | 1 | 10 | 401 | 4 | 0.34 | -0.54 | 0.56 | -0.62 | -0.76 | -0.32 | 0.76 |
| 1 | 1 | 11 | 288 | 13 | 0.4 | 1.19 | 1.24 | 0.74 | 1.61 | 0.13 | 0.38 |
| 1 | 1 | 12 | 917 | 2 | 0.26 | -0.98 | -0.91 | -0.82 | -0.77 | -0.44 | 0.55 |
| 1 | 1 | 13 | 229 | 9 | 0.52 | 1.09 | 0.33 | -0.16 | 0.33 | -0.15 | -1.94 |
| 1 | 1 | 14 | 97 | 15 | 0.66 | 2.01 | 1.24 | 3.36 | 2.46 | 0.20 | 0.01 |
| 1 | 1 | 15 | 296 | 1 | 0.34 | -0.33 | -0.53 | -0.81 | -0.51 | -0.43 | -2.16 |
Let’s have a look at the Quant.Error variable in the above
table. None of the cells has a quantization error below the
threshold of 0.2.
Now let’s check the compression summary. The table below shows, for each level, the number of cells, the number of cells with quantization error below the threshold, and the percentage of cells with quantization error below the threshold.
compressionSummaryTable(hvt.results[[3]]$compression_summary)

| segmentLevel | noOfCells | noOfCellsBelowQuantizationError | percentOfCellsBelowQuantizationErrorThreshold | parameters |
|---|---|---|---|---|
| 1 | 15 | 0 | 0 | n_cells: 15 quant.err: 0.2 distance_metric: L1_Norm error_metric: mean quant_method: kmeans |
As it can be seen in the table above, percentage of cells in level 1
having Quantization Error below threshold is 0%. Hence, we
can go one level deeper and try to compress it further.
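The level 1 row of the compression summary can be reproduced from the Quant.Error and n columns of the summary table above. The values below are copied from that table. Treating "below threshold" as a strict `<` comparison is an assumption, and the variable names are illustrative only.

```r
# Quant.Error and n for the 15 level 1 cells, copied from the summary table
quant_error <- c(0.38, 0.57, 0.41, 0.30, 0.32, 0.57, 0.27, 0.35,
                 0.30, 0.34, 0.40, 0.26, 0.52, 0.66, 0.34)
n_points <- c(480, 390, 145, 505, 241, 150, 286, 258,
              324, 401, 288, 917, 229, 97, 296)
threshold <- 0.2  # quant.err passed to HVT

# Assumption: a cell counts as compressed when its error is strictly below
below <- quant_error < threshold
data.frame(noOfCells = length(quant_error),
           noOfCellsBelow = sum(below),
           percentBelow = mean(below))

# The cell sizes add up to the number of training rows
sum(n_points)  # 5007
```

The smallest level 1 error is 0.26, so no cell passes the 0.2 threshold, which matches the 0 reported in the table.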
We will now overlay the Quant.Error variable as heatmap
over the Voronoi Tessellation plot to visualize the quantization error
better.
Let’s have a look at the function hvtHmap which we will
use to overlay a variable as heatmap.
hvtHmap(hvt.results, dataset, child.level, hmap.cols, color.vec, line.width, palette.color = 6)
hvt.results - A list of hvt.results
obtained from the HVT function
dataset - A dataframe containing
the variables to overlay as a heatmap. The user can pass an external
dataset or the dataset that was used to perform hierarchical vector
quantization. The dataset should have the same number of points as the
dataset used to perform hierarchical Vector Quantization in the HVT
function
child.level - A number indicating
the level for which the heat map is to be plotted
hmap.cols - The column number or
column name from the dataset indicating the variables for which the heat
map is to be plotted. To plot the quantization error as heatmap, pass
'quant_error'. Similarly to plot the no of points in each
cell as heatmap, pass 'no_of_points' as a
parameter
color.vec - A color vector such
that length(color.vec) = child.level (default = NULL)
line.width - A line width vector
such that length(line.width) = child.level (default = NULL)
palette.color - A number indicating
the heat map color palette. 1 - rainbow, 2 - heat.colors, 3 -
terrain.colors, 4 - topo.colors, 5 - cm.colors, 6 - BlCyGrYlRd
(Blue,Cyan,Green,Yellow,Red) color (default = 6)
show.points - A boolean indicating
whether the centroids should be plotted on the tessellations (default =
FALSE)
Now let’s plot the quantization error for each cell at level one as a heatmap.
muHVT::hvtHmap(
hvt.results,
trainComputers,
child.level = 1,
hmap.cols = "Quant.Error",
line.width = c(0.2),
color.vec = c("#141B41"),
palette.color = 6,
centroid.size = 1.5,
show.points = T,
quant.error.hmap = 0.2,
n_cells.hmap = 15
)
Figure 4: The Voronoi Tessellation with the heat map overlaid for variable ’quant_error’ in the ’computers’ dataset
Now let’s go one level deeper and perform hierarchical vector quantization.
set.seed(240)
hvt.results2 <- list()
# depth=2 is used for level2 in the hierarchy
hvt.results2 <- muHVT::HVT(
trainComputers,
n_cells = 15,
depth = 2,
quant.err = 0.2,
projection.scale = 10,
normalize = T,
distance_metric = "L1_Norm",
error_metric = "mean",
quant_method = "kmeans"
)
Let’s plot the Voronoi tessellation for both the levels.
# Voronoi tessellation plot for level two
muHVT::plotHVT(
hvt.results2,
line.width = c(0.6, 0.4),
color.vec = c("#141B41", "#0582CA"),
centroid.size = 1.5,
maxDepth = 2
)
Figure 5: The Voronoi Tessellation for level 2 shown for the 225 cells in the dataset ’computers’
In the table below, Segment Level signifies the depth.
Level 1 has 15 cells
Level 2 has 225 cells, i.e., each of the 15 cells in level 1 is divided into 15 cells
Let’s analyze the summary table again for Quant.Error
and see if any of the cells in the 2nd level have Quantization Error
below the Quantization Error threshold. In the table below, the values
for Quant.Error of the cells which have hit the
Quantization Error threshold are shown in red. Here we show only the
first 50 rows for the sake of brevity.
summaryTable(hvt.results2[[3]][['summary']],limit = 50)

| Segment.Level | Segment.Parent | Segment.Child | n | Cell.ID | Quant.Error | price | speed | hd | ram | screen | ads |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 480 | 128 | 0.38 | 0.69 | 0.70 | 0.24 | -0.02 | 0.06 | 0.57 |
| 1 | 1 | 2 | 390 | 156 | 0.57 | 0.83 | 0.21 | 0.05 | 0.10 | 2.88 | 0.10 |
| 1 | 1 | 3 | 145 | 129 | 0.41 | 0.27 | 2.67 | 0.17 | -0.20 | -0.17 | 0.71 |
| 1 | 1 | 4 | 505 | 91 | 0.3 | -0.17 | -0.80 | 0.24 | -0.04 | -0.31 | 0.42 |
| 1 | 1 | 5 | 241 | 62 | 0.32 | -0.34 | 0.66 | -0.73 | -0.75 | -0.40 | -0.40 |
| 1 | 1 | 6 | 150 | 198 | 0.57 | 0.90 | -0.55 | 2.71 | 2.32 | 0.29 | -0.60 |
| 1 | 1 | 7 | 286 | 154 | 0.27 | 0.75 | -0.71 | 0.79 | 1.61 | -0.41 | 0.35 |
| 1 | 1 | 8 | 258 | 113 | 0.35 | -0.39 | 0.76 | 0.71 | 0.00 | -0.16 | -0.54 |
| 1 | 1 | 9 | 324 | 31 | 0.3 | -1.08 | -0.79 | -0.56 | -0.69 | -0.38 | -0.76 |
| 1 | 1 | 10 | 401 | 56 | 0.34 | -0.54 | 0.56 | -0.62 | -0.76 | -0.32 | 0.76 |
| 1 | 1 | 11 | 288 | 182 | 0.4 | 1.19 | 1.24 | 0.74 | 1.61 | 0.13 | 0.38 |
| 1 | 1 | 12 | 917 | 23 | 0.26 | -0.98 | -0.91 | -0.82 | -0.77 | -0.44 | 0.55 |
| 1 | 1 | 13 | 229 | 131 | 0.52 | 1.09 | 0.33 | -0.16 | 0.33 | -0.15 | -1.94 |
| 1 | 1 | 14 | 97 | 212 | 0.66 | 2.01 | 1.24 | 3.36 | 2.46 | 0.20 | 0.01 |
| 1 | 1 | 15 | 296 | 22 | 0.34 | -0.33 | -0.53 | -0.81 | -0.51 | -0.43 | -2.16 |
| 2 | 1 | 1 | 26 | 142 | 0.13 | 1.23 | 0.88 | 0.18 | 0.06 | 0.55 | 0.82 |
| 2 | 1 | 2 | 45 | 138 | 0.12 | 0.37 | 0.93 | 0.60 | 0.05 | 0.55 | 0.53 |
| 2 | 1 | 3 | 31 | 116 | 0.15 | 0.40 | 0.06 | 0.02 | -0.01 | 0.55 | 0.89 |
| 2 | 1 | 4 | 14 | 145 | 0.2 | 2.19 | 0.68 | 0.60 | -0.05 | -0.61 | 0.34 |
| 2 | 1 | 5 | 29 | 137 | 0.14 | 1.20 | 0.92 | 0.39 | 0.04 | -0.61 | 0.90 |
| 2 | 1 | 6 | 32 | 120 | 0.13 | 0.23 | 1.02 | 0.38 | 0.06 | -0.61 | 1.31 |
| 2 | 1 | 7 | 28 | 124 | 0.2 | 0.91 | 0.80 | -0.02 | -0.13 | -0.61 | -0.01 |
| 2 | 1 | 8 | 39 | 118 | 0.11 | -0.03 | 0.92 | -0.06 | 0.04 | 0.55 | 0.50 |
| 2 | 1 | 9 | 48 | 111 | 0.14 | 0.53 | 0.09 | 0.18 | 0.00 | -0.61 | 0.58 |
| 2 | 1 | 10 | 32 | 158 | 0.22 | 2.19 | 0.81 | 0.66 | -0.20 | 0.55 | 0.33 |
| 2 | 1 | 11 | 36 | 122 | 0.14 | 0.40 | 0.09 | -0.01 | 0.04 | 0.55 | 0.18 |
| 2 | 1 | 12 | 26 | 140 | 0.15 | 0.95 | 0.92 | 0.34 | 0.03 | 0.55 | -0.07 |
| 2 | 1 | 13 | 22 | 132 | 0.13 | -0.02 | 0.86 | 0.46 | 0.06 | 0.55 | 1.43 |
| 2 | 1 | 14 | 22 | 123 | 0.18 | 0.92 | 0.80 | -0.51 | -0.43 | 0.55 | 0.37 |
| 2 | 1 | 15 | 50 | 119 | 0.13 | 0.38 | 0.93 | 0.32 | -0.05 | -0.61 | 0.55 |
| 2 | 2 | 1 | 24 | 190 | 0.22 | 2.32 | 0.92 | 0.32 | -0.03 | 2.88 | 0.53 |
| 2 | 2 | 2 | 12 | 201 | 0.37 | 1.25 | -0.49 | 1.29 | 1.37 | 2.88 | -1.03 |
| 2 | 2 | 3 | 19 | 155 | 0.26 | 1.25 | 0.31 | -0.31 | -0.43 | 2.88 | 0.80 |
| 2 | 2 | 4 | 6 | 217 | 0.37 | 3.80 | 2.08 | 1.45 | 0.06 | 2.88 | -0.25 |
| 2 | 2 | 5 | 16 | 146 | 0.21 | 0.73 | -0.67 | -0.52 | -0.13 | 2.88 | -1.75 |
| 2 | 2 | 6 | 19 | 195 | 0.36 | 0.76 | 0.96 | 1.19 | 1.22 | 2.88 | -0.33 |
| 2 | 2 | 7 | 56 | 5 | 0.23 | -0.52 | -0.90 | -0.50 | -0.71 | 2.88 | 0.40 |
| 2 | 2 | 8 | 63 | 133 | 0.26 | 0.46 | -0.80 | -0.11 | -0.23 | 2.88 | 0.51 |
| 2 | 2 | 9 | 23 | 175 | 0.21 | 1.21 | 0.52 | -0.54 | -0.14 | 2.88 | -2.06 |
| 2 | 2 | 10 | 17 | 130 | 0.2 | -0.04 | 0.19 | -0.12 | -0.30 | 2.88 | 0.46 |
| 2 | 2 | 11 | 43 | 202 | 0.28 | 2.40 | 0.72 | 0.26 | 1.63 | 2.88 | 0.29 |
| 2 | 2 | 12 | 20 | 134 | 0.24 | -0.28 | 0.71 | 0.02 | -0.33 | 2.88 | -0.46 |
| 2 | 2 | 13 | 48 | 159 | 0.18 | 0.71 | 0.80 | 0.14 | 0.02 | 2.88 | 0.39 |
| 2 | 2 | 14 | 13 | 197 | 0.28 | 0.96 | 2.67 | 0.49 | 0.55 | 2.88 | 0.56 |
| 2 | 2 | 15 | 11 | 174 | 0.17 | 0.65 | 0.77 | 0.32 | 0.06 | 2.88 | 1.34 |
| 2 | 3 | 1 | 6 | 189 | 0.18 | 1.53 | 2.67 | 1.98 | 0.06 | -0.61 | -0.49 |
| 2 | 3 | 2 | 14 | 115 | 0.19 | 0.08 | 2.67 | 0.25 | -0.27 | -0.53 | -0.41 |
| 2 | 3 | 3 | 8 | 9 | 0.16 | -0.26 | 2.67 | -0.73 | -0.82 | 0.55 | 1.02 |
| 2 | 3 | 4 | 14 | 139 | 0.11 | 0.37 | 2.67 | 0.59 | 0.06 | -0.61 | 0.33 |
| 2 | 3 | 5 | 27 | 141 | 0.1 | 0.50 | 2.67 | 0.32 | 0.06 | -0.61 | 1.33 |
Users can look at the compression summary for a quick overview of the compression, since the summary table above becomes cumbersome to read as we go deeper.
compressionSummaryTable(hvt.results2[[3]]$compression_summary)

| segmentLevel | noOfCells | noOfCellsBelowQuantizationError | percentOfCellsBelowQuantizationErrorThreshold | parameters |
|---|---|---|---|---|
| 1 | 15 | 0 | 0 | n_cells: 15 quant.err: 0.2 distance_metric: L1_Norm error_metric: mean quant_method: kmeans |
| 2 | 207 | 176 | 0.85 | n_cells: 15 quant.err: 0.2 distance_metric: L1_Norm error_metric: mean quant_method: kmeans |
As can be seen in the table above, only 85% of the cells in
the 2nd level have Quantization Error below threshold. Therefore, we can
go another level deeper and try to compress the data further.
We will look at the heatmap for Quantization Error for level 2.
muHVT::hvtHmap(
hvt.results2,
trainComputers,
child.level = 2,
hmap.cols = "Quant.Error",
line.width = c(0.6, 0.4),
color.vec = c("#141B41", "#0582CA"),
palette.color = 6,
centroid.size = 2,
show.points = T,
quant.error.hmap = 0.2,
n_cells.hmap = 15
)
Figure 6: The Voronoi Tessellation with the heat map overlaid for variable ’quant_error’ in the ’computers’ dataset
As the Quantization Error criterion is not met, let’s perform hierarchical Vector Quantization at level 3.
set.seed(240)
hvt.results3 <- list()
# depth=3 is used for level3 in the hierarchy
hvt.results3 <- muHVT::HVT(
trainComputers,
n_cells = 15,
depth = 3,
quant.err = 0.2,
projection.scale = 10,
normalize = T,
distance_metric = "L1_Norm",
error_metric = "mean",
quant_method = "kmeans"
)
Let’s plot the Voronoi Tessellation for all 3 levels.
# Voronoi tessellation plot for level three
muHVT::plotHVT(
hvt.results3,
line.width = c(0.6, 0.4, 0.2),
color.vec = c("#141B41", "#0582CA", "#8BA0B4"),
centroid.size = 1.5,
maxDepth = 3
)
Figure 7: The Voronoi Tessellation for level 3 shown for the 1905 cells in the dataset ’computers’
Each level 2 cell whose quantization error is above the defined threshold breaks down into up to 15 cells in level 3. The summary table reserves a row for every possible cell, so level 3 contributes 225 * 15 = 3375 rows and the table has 15 + 225 + 3375 = 3615 rows in total. We will only show the first 500 rows here.
summaryTable(hvt.results3[[3]][['summary']],scroll = T,limit = 500)

| Segment.Level | Segment.Parent | Segment.Child | n | Cell.ID | Quant.Error | price | speed | hd | ram | screen | ads |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 480 | 156 | 0.38 | 0.69 | 0.70 | 0.24 | -0.02 | 0.06 | 0.57 |
| 1 | 1 | 2 | 390 | 198 | 0.57 | 0.83 | 0.21 | 0.05 | 0.10 | 2.88 | 0.10 |
| 1 | 1 | 3 | 145 | 142 | 0.41 | 0.27 | 2.67 | 0.17 | -0.20 | -0.17 | 0.71 |
| 1 | 1 | 4 | 505 | 107 | 0.3 | -0.17 | -0.80 | 0.24 | -0.04 | -0.31 | 0.42 |
| 1 | 1 | 5 | 241 | 67 | 0.32 | -0.34 | 0.66 | -0.73 | -0.75 | -0.40 | -0.40 |
| 1 | 1 | 6 | 150 | 270 | 0.57 | 0.90 | -0.55 | 2.71 | 2.32 | 0.29 | -0.60 |
| 1 | 1 | 7 | 286 | 197 | 0.27 | 0.75 | -0.71 | 0.79 | 1.61 | -0.41 | 0.35 |
| 1 | 1 | 8 | 258 | 139 | 0.35 | -0.39 | 0.76 | 0.71 | 0.00 | -0.16 | -0.54 |
| 1 | 1 | 9 | 324 | 32 | 0.3 | -1.08 | -0.79 | -0.56 | -0.69 | -0.38 | -0.76 |
| 1 | 1 | 10 | 401 | 62 | 0.34 | -0.54 | 0.56 | -0.62 | -0.76 | -0.32 | 0.76 |
| 1 | 1 | 11 | 288 | 234 | 0.4 | 1.19 | 1.24 | 0.74 | 1.61 | 0.13 | 0.38 |
| 1 | 1 | 12 | 917 | 23 | 0.26 | -0.98 | -0.91 | -0.82 | -0.77 | -0.44 | 0.55 |
| 1 | 1 | 13 | 229 | 159 | 0.52 | 1.09 | 0.33 | -0.16 | 0.33 | -0.15 | -1.94 |
| 1 | 1 | 14 | 97 | 301 | 0.66 | 2.01 | 1.24 | 3.36 | 2.46 | 0.20 | 0.01 |
| 1 | 1 | 15 | 296 | 26 | 0.34 | -0.33 | -0.53 | -0.81 | -0.51 | -0.43 | -2.16 |
| 2 | 1 | 1 | 26 | 177 | 0.13 | 1.23 | 0.88 | 0.18 | 0.06 | 0.55 | 0.82 |
| 2 | 1 | 2 | 45 | 167 | 0.12 | 0.37 | 0.93 | 0.60 | 0.05 | 0.55 | 0.53 |
| 2 | 1 | 3 | 31 | 141 | 0.15 | 0.40 | 0.06 | 0.02 | -0.01 | 0.55 | 0.89 |
| 2 | 1 | 4 | 14 | 188 | 0.2 | 2.19 | 0.68 | 0.60 | -0.05 | -0.61 | 0.34 |
| 2 | 1 | 5 | 29 | 166 | 0.14 | 1.20 | 0.92 | 0.39 | 0.04 | -0.61 | 0.90 |
| 2 | 1 | 6 | 32 | 138 | 0.13 | 0.23 | 1.02 | 0.38 | 0.06 | -0.61 | 1.31 |
| 2 | 1 | 7 | 28 | 149 | 0.2 | 0.91 | 0.80 | -0.02 | -0.13 | -0.61 | -0.01 |
| 2 | 1 | 8 | 39 | 144 | 0.11 | -0.03 | 0.92 | -0.06 | 0.04 | 0.55 | 0.50 |
| 2 | 1 | 9 | 48 | 132 | 0.14 | 0.53 | 0.09 | 0.18 | 0.00 | -0.61 | 0.58 |
| 2 | 1 | 10 | 32 | 206 | 0.22 | 2.19 | 0.81 | 0.66 | -0.20 | 0.55 | 0.33 |
| 2 | 1 | 11 | 36 | 148 | 0.14 | 0.40 | 0.09 | -0.01 | 0.04 | 0.55 | 0.18 |
| 2 | 1 | 12 | 26 | 174 | 0.15 | 0.95 | 0.92 | 0.34 | 0.03 | 0.55 | -0.07 |
| 2 | 1 | 13 | 22 | 152 | 0.13 | -0.02 | 0.86 | 0.46 | 0.06 | 0.55 | 1.43 |
| 2 | 1 | 14 | 22 | 146 | 0.18 | 0.92 | 0.80 | -0.51 | -0.43 | 0.55 | 0.37 |
| 2 | 1 | 15 | 50 | 143 | 0.13 | 0.38 | 0.93 | 0.32 | -0.05 | -0.61 | 0.55 |
| 2 | 2 | 1 | 24 | 249 | 0.22 | 2.32 | 0.92 | 0.32 | -0.03 | 2.88 | 0.53 |
| 2 | 2 | 2 | 12 | 267 | 0.37 | 1.25 | -0.49 | 1.29 | 1.37 | 2.88 | -1.03 |
| 2 | 2 | 3 | 19 | 191 | 0.26 | 1.25 | 0.31 | -0.31 | -0.43 | 2.88 | 0.80 |
| 2 | 2 | 4 | 6 | 307 | 0.37 | 3.80 | 2.08 | 1.45 | 0.06 | 2.88 | -0.25 |
| 2 | 2 | 5 | 16 | 169 | 0.21 | 0.73 | -0.67 | -0.52 | -0.13 | 2.88 | -1.75 |
| 2 | 2 | 6 | 19 | 261 | 0.36 | 0.76 | 0.96 | 1.19 | 1.22 | 2.88 | -0.33 |
| 2 | 2 | 7 | 56 | 58 | 0.23 | -0.52 | -0.90 | -0.50 | -0.71 | 2.88 | 0.40 |
| 2 | 2 | 8 | 63 | 160 | 0.26 | 0.46 | -0.80 | -0.11 | -0.23 | 2.88 | 0.51 |
| 2 | 2 | 9 | 23 | 209 | 0.21 | 1.21 | 0.52 | -0.54 | -0.14 | 2.88 | -2.06 |
| 2 | 2 | 10 | 17 | 158 | 0.2 | -0.04 | 0.19 | -0.12 | -0.30 | 2.88 | 0.46 |
| 2 | 2 | 11 | 43 | 274 | 0.28 | 2.40 | 0.72 | 0.26 | 1.63 | 2.88 | 0.29 |
| 2 | 2 | 12 | 20 | 162 | 0.24 | -0.28 | 0.71 | 0.02 | -0.33 | 2.88 | -0.46 |
| 2 | 2 | 13 | 48 | 202 | 0.18 | 0.71 | 0.80 | 0.14 | 0.02 | 2.88 | 0.39 |
| 2 | 2 | 14 | 13 | 262 | 0.28 | 0.96 | 2.67 | 0.49 | 0.55 | 2.88 | 0.56 |
| 2 | 2 | 15 | 11 | 214 | 0.17 | 0.65 | 0.77 | 0.32 | 0.06 | 2.88 | 1.34 |
| 2 | 3 | 1 | 6 | 254 | 0.18 | 1.53 | 2.67 | 1.98 | 0.06 | -0.61 | -0.49 |
| 2 | 3 | 2 | 14 | 126 | 0.19 | 0.08 | 2.67 | 0.25 | -0.27 | -0.53 | -0.41 |
| 2 | 3 | 3 | 8 | 16 | 0.16 | -0.26 | 2.67 | -0.73 | -0.82 | 0.55 | 1.02 |
| 2 | 3 | 4 | 14 | 163 | 0.11 | 0.37 | 2.67 | 0.59 | 0.06 | -0.61 | 0.33 |
| 2 | 3 | 5 | 27 | 154 | 0.1 | 0.50 | 2.67 | 0.32 | 0.06 | -0.61 | 1.33 |
| 2 | 3 | 6 | 46 | 179 | 0.18 | 0.39 | 2.67 | 0.47 | 0.06 | 0.55 | 0.72 |
| 2 | 3 | 7 | 10 | 36 | 0.11 | -0.27 | 2.67 | -0.56 | -0.72 | -0.61 | 0.42 |
| 2 | 3 | 8 | 20 | 4 | 0.16 | -0.12 | 2.67 | -0.91 | -0.89 | -0.61 | 1.27 |
| 2 | 3 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 2 | 3 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 2 | 3 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 2 | 3 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 2 | 3 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 2 | 3 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 2 | 3 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 2 | 4 | 1 | 32 | 95 | 0.13 | 0.15 | -0.83 | -0.35 | 0.06 | -0.61 | 0.20 |
| 2 | 4 | 2 | 14 | 120 | 0.13 | -0.48 | -0.81 | 0.47 | 0.06 | 0.55 | -0.54 |
| 2 | 4 | 3 | 59 | 99 | 0.18 | -0.50 | -0.80 | -0.08 | -0.03 | 0.55 | 0.70 |
| 2 | 4 | 4 | 20 | 80 | 0.08 | -0.39 | -0.91 | 0.43 | 0.06 | -0.61 | 1.52 |
| 2 | 4 | 5 | 17 | 118 | 0.1 | 0.10 | -0.86 | 0.45 | 0.06 | -0.61 | -0.49 |
| 2 | 4 | 6 | 26 | 116 | 0.08 | -0.25 | -0.83 | 0.83 | 0.06 | -0.61 | 0.28 |
| 2 | 4 | 7 | 58 | 94 | 0.1 | -0.28 | -0.96 | 0.34 | 0.06 | -0.61 | 0.79 |
| 2 | 4 | 8 | 19 | 108 | 0.05 | -0.52 | -0.80 | 0.82 | 0.06 | -0.61 | -0.71 |
| 2 | 4 | 9 | 48 | 129 | 0.14 | -0.14 | -0.85 | 0.50 | 0.05 | 0.55 | 0.46 |
| 2 | 4 | 10 | 29 | 76 | 0.12 | -0.77 | -0.77 | -0.01 | 0.06 | -0.61 | 0.52 |
| 2 | 4 | 11 | 40 | 100 | 0.11 | -0.35 | -0.94 | 0.29 | 0.06 | -0.61 | 0.04 |
| 2 | 4 | 12 | 58 | 117 | 0.08 | 0.20 | -0.80 | 0.35 | 0.06 | -0.61 | 0.51 |
| 2 | 4 | 13 | 25 | 65 | 0.13 | -0.23 | -0.80 | 0.10 | -0.72 | -0.61 | 0.66 |
| 2 | 4 | 14 | 26 | 119 | 0.12 | -0.08 | 0.09 | 0.32 | 0.03 | -0.61 | 0.71 |
| 2 | 4 | 15 | 34 | 101 | 0.23 | 0.70 | -0.80 | -0.24 | -0.65 | -0.24 | 0.19 |
| 2 | 5 | 1 | 16 | 55 | 0.09 | -1.06 | 0.92 | -0.54 | -0.72 | -0.61 | -0.65 |
| 2 | 5 | 2 | 19 | 88 | 0.2 | -0.27 | 0.66 | -0.78 | -0.74 | 0.55 | -0.08 |
| 2 | 5 | 3 | 22 | 53 | 0.09 | -0.25 | 0.92 | -1.16 | -1.02 | -0.61 | -0.02 |
| 2 | 5 | 4 | 10 | 61 | 0.09 | -0.21 | 0.09 | -0.66 | -0.64 | -0.61 | -1.10 |
| 2 | 5 | 5 | 9 | 50 | 0.1 | -0.72 | 0.09 | -0.66 | -0.76 | -0.61 | -0.71 |
| 2 | 5 | 6 | 13 | 39 | 0.13 | -0.79 | 0.92 | -0.91 | -0.78 | -0.61 | -1.27 |
| 2 | 5 | 7 | 16 | 86 | 0.14 | -0.09 | 0.92 | -0.56 | -0.52 | -0.61 | -0.88 |
| 2 | 5 | 8 | 17 | 97 | 0.13 | -0.19 | 0.92 | -0.32 | -0.58 | -0.61 | -0.02 |
| 2 | 5 | 9 | 21 | 56 | 0.07 | -0.85 | 0.92 | -0.80 | -0.72 | -0.61 | 0.06 |
| 2 | 5 | 10 | 6 | 57 | 0.08 | -0.08 | 0.92 | -0.89 | -0.72 | -0.61 | -1.57 |
| 2 | 5 | 11 | 20 | 74 | 0.08 | 0.10 | 0.09 | -0.64 | -0.72 | -0.61 | -0.05 |
| 2 | 5 | 12 | 12 | 75 | 0.18 | -0.87 | 0.57 | -0.48 | -0.75 | 0.55 | -0.55 |
| 2 | 5 | 13 | 20 | 38 | 0.11 | -0.57 | 0.09 | -1.10 | -1.01 | -0.61 | -0.03 |
| 2 | 5 | 14 | 26 | 102 | 0.1 | 0.40 | 0.92 | -0.60 | -0.72 | -0.61 | -0.14 |
| 2 | 5 | 15 | 14 | 83 | 0.23 | -0.27 | 0.56 | -0.77 | -0.63 | 0.55 | -1.18 |
| 2 | 6 | 1 | 5 | 239 | 0.02 | 0.51 | -0.78 | 1.73 | 1.63 | 0.55 | -1.30 |
| 2 | 6 | 2 | 5 | 236 | 0.01 | 0.19 | -0.78 | 1.73 | 1.63 | 0.55 | -1.30 |
| 2 | 6 | 3 | 5 | 210 | 0.2 | 0.33 | -0.78 | 3.27 | 0.06 | -0.15 | 0.31 |
| 2 | 6 | 4 | 3 | 238 | 0.03 | 0.60 | -0.78 | 1.73 | 1.63 | 0.55 | -0.69 |
| 2 | 6 | 5 | 6 | 235 | 0.02 | 0.38 | -0.78 | 1.73 | 1.63 | 0.55 | -0.65 |
| 2 | 6 | 6 | 20 | 242 | 0.16 | 0.55 | 0.38 | 1.82 | 1.63 | 0.55 | -1.06 |
| 2 | 6 | 7 | 12 | 299 | 0.08 | 1.44 | 0.09 | 3.06 | 3.19 | 0.55 | -0.94 |
| 2 | 6 | 8 | 6 | 173 | 0.06 | -0.22 | -0.78 | 3.06 | 0.06 | -0.61 | -0.92 |
| 2 | 6 | 9 | 32 | 290 | 0.09 | 1.19 | -0.78 | 3.06 | 3.19 | -0.54 | 0.10 |
| 2 | 6 | 10 | 6 | 230 | 0.01 | 0.12 | -0.78 | 1.73 | 1.63 | 0.55 | -0.84 |
| 2 | 6 | 11 | 3 | 231 | 0.01 | 0.13 | -0.78 | 1.73 | 1.63 | 0.55 | -0.62 |
| 2 | 6 | 12 | 3 | 264 | 0.2 | 1.30 | -0.78 | 3.06 | 1.63 | -0.61 | -0.03 |
| 2 | 6 | 13 | 5 | 269 | 0.17 | 1.24 | -0.78 | 4.82 | -0.09 | -0.38 | 0.49 |
| 2 | 6 | 14 | 33 | 295 | 0.08 | 1.28 | -0.78 | 3.06 | 3.19 | 0.55 | -0.81 |
| 2 | 6 | 15 | 6 | 315 | 0.21 | 1.11 | -0.49 | 3.06 | 3.19 | 2.88 | -1.07 |
| 2 | 7 | 1 | 31 | 208 | 0.15 | 0.57 | -0.80 | 0.83 | 1.63 | 0.55 | 0.67 |
| 2 | 7 | 2 | 43 | 212 | 0.1 | 1.19 | 0.09 | 0.78 | 1.63 | -0.61 | 0.39 |
| 2 | 7 | 3 | 52 | 186 | 0.14 | 0.30 | -0.89 | 0.97 | 1.63 | -0.61 | 0.13 |
| 2 | 7 | 4 | 21 | 193 | 0.17 | 0.85 | -0.84 | 0.65 | 1.63 | -0.61 | -0.80 |
| 2 | 7 | 5 | 4 | 165 | 0.08 | 2.02 | -0.78 | 0.46 | 0.06 | -0.61 | 0.52 |
| 2 | 7 | 6 | 19 | 207 | 0.18 | 0.77 | -0.87 | 0.50 | 1.63 | 0.55 | -0.36 |
| 2 | 7 | 7 | 28 | 184 | 0.15 | 0.37 | -0.73 | 0.82 | 1.63 | -0.61 | 1.27 |
| 2 | 7 | 8 | 88 | 195 | 0.12 | 0.90 | -0.89 | 0.78 | 1.63 | -0.61 | 0.48 |
| 2 | 7 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 2 | 7 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 2 | 7 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 2 | 7 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 2 | 7 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 2 | 7 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 2 | 7 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 2 | 8 | 1 | 25 | 147 | 0.21 | -0.55 | 0.62 | 0.71 | 0.00 | 0.55 | -1.04 |
| 2 | 8 | 2 | 31 | 153 | 0.15 | -0.22 | 0.89 | 0.57 | 0.06 | 0.55 | -0.07 |
| 2 | 8 | 3 | 20 | 133 | 0.09 | -0.50 | 0.92 | 0.83 | 0.02 | -0.61 | -0.98 |
| 2 | 8 | 4 | 15 | 115 | 0.09 | -0.80 | 0.86 | 0.21 | 0.06 | -0.61 | -0.66 |
| 2 | 8 | 5 | 13 | 201 | 0.22 | -0.11 | 0.54 | 3.06 | 0.06 | -0.26 | -0.81 |
| 2 | 8 | 6 | 6 | 113 | 0.1 | -0.40 | -0.78 | 0.82 | 0.06 | -0.23 | -1.30 |
| 2 | 8 | 7 | 16 | 150 | 0.17 | 0.37 | 0.66 | 0.61 | 0.06 | -0.61 | -0.65 |
| 2 | 8 | 8 | 14 | 123 | 0.07 | -0.36 | 0.09 | 0.82 | 0.06 | -0.61 | -0.94 |
| 2 | 8 | 9 | 7 | 114 | 0.04 | -0.66 | 0.92 | 0.25 | 0.06 | -0.61 | -1.30 |
| 2 | 8 | 10 | 15 | 178 | 0.11 | -0.25 | 0.92 | 1.77 | 0.06 | 0.55 | -0.48 |
| 2 | 8 | 11 | 15 | 98 | 0.09 | -0.76 | 0.92 | 0.39 | -0.72 | -0.61 | -0.42 |
| 2 | 8 | 12 | 15 | 140 | 0.06 | -0.28 | 0.92 | 0.89 | 0.06 | -0.61 | 0.07 |
| 2 | 8 | 13 | 23 | 130 | 0.17 | -0.72 | 0.84 | 0.13 | -0.04 | 0.55 | -0.47 |
| 2 | 8 | 14 | 19 | 122 | 0.07 | -0.45 | 0.92 | 0.25 | 0.06 | -0.61 | 0.15 |
| 2 | 8 | 15 | 24 | 124 | 0.06 | -0.31 | 0.92 | 0.28 | 0.06 | -0.61 | -0.38 |
| 2 | 9 | 1 | 27 | 52 | 0.17 | -1.33 | -0.65 | -0.02 | -0.70 | 0.55 | -0.54 |
| 2 | 9 | 2 | 15 | 66 | 0.2 | -0.39 | -0.89 | -0.58 | -0.41 | 0.55 | -0.83 |
| 2 | 9 | 3 | 47 | 12 | 0.13 | -1.51 | -0.90 | -0.79 | -0.75 | -0.61 | -0.55 |
| 2 | 9 | 4 | 24 | 22 | 0.18 | -1.18 | -0.78 | -0.89 | -0.85 | 0.55 | -0.96 |
| 2 | 9 | 5 | 6 | 82 | 0.12 | -0.70 | -0.78 | 0.77 | -0.20 | -0.61 | -1.27 |
| 2 | 9 | 6 | 28 | 47 | 0.09 | -0.39 | -0.89 | -0.61 | -0.72 | -0.61 | -0.43 |
| 2 | 9 | 7 | 36 | 5 | 0.16 | -1.34 | -0.97 | -0.98 | -0.84 | -0.61 | -1.32 |
| 2 | 9 | 8 | 21 | 43 | 0.18 | -1.27 | 0.09 | -0.44 | -0.63 | -0.61 | -0.73 |
| 2 | 9 | 9 | 25 | 44 | 0.11 | -1.23 | -0.80 | 0.08 | -0.66 | -0.61 | -0.32 |
| 2 | 9 | 10 | 9 | 73 | 0.13 | -0.27 | -0.87 | -0.44 | 0.06 | -0.61 | -0.94 |
| 2 | 9 | 11 | 30 | 21 | 0.13 | -0.92 | -0.89 | -0.99 | -0.86 | -0.61 | -0.46 |
| 2 | 9 | 12 | 24 | 33 | 0.15 | -1.48 | -0.82 | -0.04 | -0.52 | -0.61 | -0.92 |
| 2 | 9 | 13 | 32 | 31 | 0.09 | -0.68 | -0.82 | -0.75 | -0.72 | -0.61 | -1.09 |
| 2 | 9 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 2 | 9 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 2 | 10 | 1 | 21 | 64 | 0.1 | -0.87 | 0.09 | 0.09 | -0.72 | -0.61 | 0.54 |
| 2 | 10 | 2 | 22 | 20 | 0.18 | -0.92 | 0.54 | -0.84 | -0.86 | -0.61 | 1.52 |
| 2 | 10 | 3 | 35 | 104 | 0.15 | -0.41 | 0.92 | -0.39 | -0.65 | 0.55 | 0.54 |
| 2 | 10 | 4 | 31 | 25 | 0.07 | -0.78 | 0.09 | -1.18 | -1.11 | -0.61 | 0.66 |
| 2 | 10 | 5 | 59 | 72 | 0.12 | -0.08 | 0.99 | -0.71 | -0.69 | -0.61 | 0.83 |
| 2 | 10 | 6 | 18 | 45 | 0.2 | -0.63 | 0.97 | -0.99 | -0.96 | 0.55 | 1.05 |
| 2 | 10 | 7 | 9 | 111 | 0.2 | -0.63 | 0.73 | 0.06 | -0.02 | 0.29 | 1.46 |
| 2 | 10 | 8 | 17 | 96 | 0.11 | -0.37 | 0.94 | 0.15 | -0.72 | -0.61 | 0.60 |
| 2 | 10 | 9 | 42 | 68 | 0.16 | -0.69 | 0.09 | -0.53 | -0.73 | 0.55 | 0.56 |
| 2 | 10 | 10 | 27 | 63 | 0.09 | -0.09 | 0.09 | -0.71 | -0.69 | -0.61 | 0.65 |
| 2 | 10 | 11 | 40 | 29 | 0.11 | -0.57 | 1.02 | -1.18 | -1.10 | -0.61 | 0.81 |
| 2 | 10 | 12 | 25 | 54 | 0.12 | -0.97 | 0.92 | -0.56 | -0.72 | -0.61 | 0.63 |
| 2 | 10 | 13 | 12 | 85 | 0.2 | -0.66 | 0.37 | -0.38 | 0.06 | -0.61 | 0.68 |
| 2 | 10 | 14 | 26 | 40 | 0.11 | -0.94 | 0.09 | -0.76 | -0.72 | -0.61 | 0.69 |
| 2 | 10 | 15 | 17 | 81 | 0.13 | -0.14 | 0.09 | 0.06 | -0.72 | -0.61 | 0.94 |
| 2 | 11 | 1 | 19 | 226 | 0.12 | 0.97 | 1.14 | 0.82 | 1.63 | -0.61 | 1.27 |
| 2 | 11 | 2 | 32 | 237 | 0.14 | 1.51 | 0.79 | 0.68 | 1.63 | 0.55 | 0.62 |
| 2 | 11 | 3 | 21 | 228 | 0.15 | 0.82 | 0.98 | 0.63 | 1.63 | 0.55 | 1.25 |
| 2 | 11 | 4 | 6 | 227 | 0.16 | 1.39 | 0.92 | 0.20 | 1.63 | 0.16 | -1.08 |
| 2 | 11 | 5 | 15 | 259 | 0.44 | 1.44 | 2.67 | 1.16 | 1.21 | 0.01 | -0.46 |
| 2 | 11 | 6 | 22 | 233 | 0.15 | 1.33 | 0.80 | 0.56 | 1.63 | 0.55 | -0.24 |
| 2 | 11 | 7 | 20 | 257 | 0.18 | 1.22 | 2.67 | 1.05 | 1.63 | -0.61 | 0.72 |
| 2 | 11 | 8 | 18 | 268 | 0.2 | 1.31 | 2.67 | 0.79 | 1.63 | 0.94 | 1.12 |
| 2 | 11 | 9 | 9 | 218 | 0.13 | 0.40 | 0.97 | 0.71 | 1.63 | 0.55 | -0.74 |
| 2 | 11 | 10 | 5 | 258 | 0.34 | 2.72 | 0.92 | 0.59 | 1.94 | 0.32 | 0.13 |
| 2 | 11 | 11 | 18 | 217 | 0.14 | 0.65 | 0.92 | 1.07 | 1.63 | -0.61 | 0.04 |
| 2 | 11 | 12 | 15 | 222 | 0.09 | 0.62 | 0.95 | 0.77 | 1.63 | 0.55 | 0.37 |
| 2 | 11 | 13 | 11 | 244 | 0.08 | 0.72 | 0.92 | 1.74 | 1.63 | 0.55 | -0.66 |
| 2 | 11 | 14 | 43 | 229 | 0.1 | 1.51 | 0.92 | 0.77 | 1.63 | -0.61 | 0.38 |
| 2 | 11 | 15 | 34 | 221 | 0.1 | 1.21 | 0.92 | 0.09 | 1.63 | 0.55 | 0.44 |
| 2 | 12 | 1 | 40 | 7 | 0.11 | -1.18 | -0.87 | -0.56 | -0.69 | -0.61 | 1.52 |
| 2 | 12 | 2 | 73 | 42 | 0.12 | -0.19 | -0.87 | -0.85 | -0.72 | -0.61 | 0.31 |
| 2 | 12 | 3 | 62 | 19 | 0.11 | -0.80 | -0.92 | -1.15 | -0.93 | -0.61 | 0.13 |
| 2 | 12 | 4 | 50 | 18 | 0.07 | -1.02 | -1.20 | -0.71 | -0.72 | -0.61 | 0.63 |
| 2 | 12 | 5 | 73 | 2 | 0.15 | -1.49 | -1.02 | -1.09 | -0.91 | -0.61 | 0.96 |
| 2 | 12 | 6 | 83 | 6 | 0.13 | -1.46 | -1.11 | -1.06 | -0.82 | -0.61 | 0.27 |
| 2 | 12 | 7 | 73 | 10 | 0.11 | -0.91 | -0.88 | -1.16 | -0.96 | -0.61 | 0.67 |
| 2 | 12 | 8 | 35 | 9 | 0.19 | -1.11 | -0.94 | -0.80 | -0.80 | 0.55 | 1.14 |
| 2 | 12 | 9 | 53 | 27 | 0.12 | -1.26 | -0.80 | -0.36 | -0.72 | -0.61 | 0.74 |
| 2 | 12 | 10 | 86 | 35 | 0.07 | -0.61 | -0.81 | -0.70 | -0.72 | -0.61 | 0.82 |
| 2 | 12 | 11 | 79 | 24 | 0.1 | -1.35 | -0.73 | -0.66 | -0.73 | -0.61 | 0.28 |
| 2 | 12 | 12 | 19 | 48 | 0.18 | -0.97 | -1.00 | -0.57 | 0.06 | -0.61 | 0.55 |
| 2 | 12 | 13 | 86 | 37 | 0.08 | -0.74 | -0.85 | -0.66 | -0.72 | -0.61 | 0.22 |
| 2 | 12 | 14 | 62 | 51 | 0.13 | -0.58 | -0.86 | -0.63 | -0.71 | 0.55 | 0.47 |
| 2 | 12 | 15 | 43 | 14 | 0.18 | -1.26 | -0.92 | -0.92 | -0.93 | 0.55 | 0.32 |
| 2 | 13 | 1 | 11 | 145 | 0.19 | 0.99 | 0.84 | -0.21 | -0.01 | -0.61 | -1.29 |
| 2 | 13 | 2 | 14 | 215 | 0.19 | 1.34 | 0.50 | 0.08 | 1.63 | -0.61 | -2.10 |
| 2 | 13 | 3 | 14 | 127 | 0.14 | 0.57 | 0.39 | -0.54 | -0.05 | 0.55 | -2.24 |
| 2 | 13 | 4 | 13 | 155 | 0.16 | 0.80 | 0.92 | -0.36 | -0.06 | 0.55 | -1.35 |
| 2 | 13 | 5 | 11 | 220 | 0.27 | 2.40 | 0.92 | 0.63 | -0.01 | 0.23 | -1.60 |
| 2 | 13 | 6 | 15 | 211 | 0.28 | 1.09 | 0.13 | -0.20 | 1.63 | 0.55 | -1.93 |
| 2 | 13 | 7 | 18 | 135 | 0.16 | 0.49 | 0.04 | -0.35 | -0.02 | 0.55 | -1.37 |
| 2 | 13 | 8 | 15 | 131 | 0.19 | 0.55 | -0.20 | 0.27 | 0.06 | -0.61 | -1.32 |
| 2 | 13 | 9 | 39 | 105 | 0.14 | 0.45 | 0.58 | -0.59 | 0.06 | -0.61 | -2.20 |
| 2 | 13 | 10 | 11 | 183 | 0.17 | 2.58 | -0.31 | 0.40 | 0.06 | -0.61 | -2.20 |
| 2 | 13 | 11 | 15 | 219 | 0.12 | 2.68 | 0.92 | 0.52 | 0.06 | -0.61 | -2.16 |
| 2 | 13 | 12 | 18 | 164 | 0.17 | 1.30 | 0.92 | -0.36 | -0.28 | 0.55 | -2.31 |
| 2 | 13 | 13 | 9 | 103 | 0.2 | 1.24 | 0.55 | -0.75 | -0.37 | -0.61 | -2.35 |
| 2 | 13 | 14 | 20 | 176 | 0.15 | 0.87 | -0.91 | 0.06 | 1.63 | -0.56 | -2.12 |
| 2 | 13 | 15 | 6 | 112 | 0.16 | 0.69 | -0.78 | -0.37 | 0.06 | 0.36 | -2.13 |
| 2 | 14 | 1 | 3 | 300 | 0.05 | 1.52 | 0.92 | 3.06 | 3.19 | 0.55 | 0.08 |
| 2 | 14 | 2 | 6 | 312 | 0.34 | 2.14 | 0.92 | 3.25 | 2.15 | 2.88 | -0.39 |
| 2 | 14 | 3 | 2 | 320 | 0.07 | 2.26 | 0.92 | 8.30 | 1.63 | 0.55 | 0.27 |
| 2 | 14 | 4 | 3 | 308 | 0.05 | 2.01 | 2.67 | 3.06 | 1.63 | 0.55 | 1.18 |
| 2 | 14 | 5 | 7 | 303 | 0.06 | 2.13 | 0.80 | 3.06 | 3.19 | 0.55 | -0.62 |
| 2 | 14 | 6 | 10 | 294 | 0.05 | 1.53 | 0.92 | 3.06 | 3.19 | -0.61 | 0.27 |
| 2 | 14 | 7 | 8 | 306 | 0.21 | 3.51 | 0.18 | 3.42 | 1.63 | -0.61 | 0.77 |
| 2 | 14 | 8 | 19 | 316 | 0.14 | 1.99 | 2.67 | 3.06 | 3.11 | -0.43 | 0.06 |
| 2 | 14 | 9 | 5 | 302 | 0.03 | 1.52 | 0.92 | 3.06 | 3.19 | 0.55 | -1.30 |
| 2 | 14 | 10 | 6 | 284 | 0.28 | 1.92 | 0.92 | 4.58 | -0.07 | -0.42 | 0.49 |
| 2 | 14 | 11 | 4 | 298 | 0.02 | 1.24 | 0.92 | 3.06 | 3.19 | 0.55 | -0.84 |
| 2 | 14 | 12 | 3 | 321 | 0.02 | 5.26 | 0.92 | 4.01 | 4.76 | 2.88 | 0.46 |
| 2 | 14 | 13 | 14 | 275 | 0.25 | 1.24 | 0.92 | 3.25 | 1.40 | 0.47 | 0.25 |
| 2 | 14 | 14 | 2 | 297 | 0.05 | 2.89 | 0.92 | 3.06 | 1.63 | -0.61 | -1.37 |
| 2 | 14 | 15 | 5 | 296 | 0.02 | 1.57 | 0.92 | 3.06 | 3.19 | -0.61 | -0.30 |
| 2 | 15 | 1 | 19 | 30 | 0.07 | -0.09 | 0.92 | -1.03 | -0.74 | -0.61 | -2.29 |
| 2 | 15 | 2 | 43 | 11 | 0.08 | -0.29 | -0.80 | -0.85 | -0.72 | -0.61 | -2.30 |
| 2 | 15 | 3 | 36 | 28 | 0.1 | -0.17 | 0.09 | -0.89 | -0.72 | -0.61 | -2.16 |
| 2 | 15 | 4 | 16 | 34 | 0.08 | -0.43 | -0.81 | -0.70 | 0.06 | -0.61 | -2.19 |
| 2 | 15 | 5 | 22 | 71 | 0.16 | 0.20 | -0.66 | -0.65 | -0.15 | 0.55 | -2.32 |
| 2 | 15 | 6 | 14 | 70 | 0.06 | 0.05 | 0.09 | -0.57 | 0.06 | -0.61 | -2.25 |
| 2 | 15 | 7 | 52 | 1 | 0.14 | -1.09 | -1.01 | -1.07 | -0.82 | -0.61 | -2.30 |
| 2 | 15 | 8 | 13 | 8 | 0.18 | -0.71 | -0.84 | -0.81 | -0.78 | 0.55 | -2.14 |
| 2 | 15 | 9 | 16 | 17 | 0.08 | -0.62 | -0.83 | -0.84 | -0.74 | -0.61 | -1.67 |
| 2 | 15 | 10 | 14 | 84 | 0.16 | 0.18 | -0.84 | -0.19 | -0.05 | -0.61 | -1.50 |
| 2 | 15 | 11 | 11 | 3 | 0.12 | -0.82 | 0.39 | -1.27 | -1.11 | -0.61 | -2.16 |
| 2 | 15 | 12 | 29 | 49 | 0.09 | 0.18 | -0.88 | -0.57 | 0.04 | -0.61 | -2.30 |
| 2 | 15 | 13 | 11 | 91 | 0.17 | 0.15 | -0.74 | -0.49 | -0.22 | 0.55 | -1.53 |
| 2 | 15 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 2 | 15 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 1 | 1 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 1 | 2 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 1 | 3 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 1 | 4 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 1 | 5 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 1 | 6 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 1 | 7 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 1 | 8 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 1 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 1 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 1 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 1 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 1 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 1 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 1 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 2 | 1 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 2 | 2 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 2 | 3 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 2 | 4 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 2 | 5 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 2 | 6 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 2 | 7 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 2 | 8 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 2 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 2 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 2 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 2 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 2 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 2 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 2 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 3 | 1 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 3 | 2 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 3 | 3 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 3 | 4 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 3 | 5 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 3 | 6 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 3 | 7 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 3 | 8 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 3 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 3 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 3 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 3 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 3 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 3 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 3 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 4 | 1 | 5 | 190 | 0.13 | 2.01 | 0.92 | 0.80 | -0.25 | -0.61 | 0.12 |
| 3 | 4 | 2 | 5 | 194 | 0.09 | 2.37 | 0.92 | 0.50 | 0.06 | -0.61 | 0.43 |
| 3 | 4 | 3 | 4 | 175 | 0.08 | 2.18 | 0.09 | 0.46 | 0.06 | -0.61 | 0.52 |
| 3 | 4 | 4 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 4 | 5 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 4 | 6 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 4 | 7 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 4 | 8 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 4 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 4 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 4 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 4 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 4 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 4 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 4 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 5 | 1 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 5 | 2 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 5 | 3 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 5 | 4 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 5 | 5 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 5 | 6 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 5 | 7 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 5 | 8 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 5 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 5 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 5 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 5 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 5 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 5 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 5 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 6 | 1 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 6 | 2 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 6 | 3 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 6 | 4 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 6 | 5 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 6 | 6 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 6 | 7 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 6 | 8 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 6 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 6 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 6 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 6 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 6 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 6 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 6 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 7 | 1 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 7 | 2 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 7 | 3 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 7 | 4 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 7 | 5 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 7 | 6 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 7 | 7 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 7 | 8 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 7 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 7 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 7 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 7 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 7 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 7 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 7 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 8 | 1 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 8 | 2 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 8 | 3 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 8 | 4 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 8 | 5 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 8 | 6 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 8 | 7 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 8 | 8 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 8 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 8 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 8 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 8 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 8 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 8 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 8 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 9 | 1 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 9 | 2 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 9 | 3 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 9 | 4 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 9 | 5 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 9 | 6 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 9 | 7 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 9 | 8 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 9 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 9 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 9 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 9 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 9 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 9 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 9 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 10 | 1 | 3 | 189 | 0.09 | 1.90 | -0.20 | 0.45 | 0.06 | 0.55 | 0.05 |
| 3 | 10 | 2 | 18 | 213 | 0.12 | 2.29 | 0.92 | 0.69 | 0.06 | 0.55 | 0.38 |
| 3 | 10 | 3 | 11 | 199 | 0.14 | 2.11 | 0.92 | 0.65 | -0.72 | 0.55 | 0.31 |
| 3 | 10 | 4 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 10 | 5 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 10 | 6 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 10 | 7 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 10 | 8 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 10 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 10 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 10 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 10 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 10 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 10 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 10 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 11 | 1 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 11 | 2 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 11 | 3 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 11 | 4 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 11 | 5 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 11 | 6 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 11 | 7 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 11 | 8 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 11 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 11 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 11 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 11 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 11 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 11 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 11 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 12 | 1 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 12 | 2 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 12 | 3 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 12 | 4 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 12 | 5 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 12 | 6 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 12 | 7 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 12 | 8 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 12 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 12 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 12 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 12 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 12 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 12 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 12 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 13 | 1 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 13 | 2 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 13 | 3 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 13 | 4 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 13 | 5 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 13 | 6 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 13 | 7 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 13 | 8 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 13 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 13 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 13 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 13 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 13 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 13 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 13 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 14 | 1 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 14 | 2 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 14 | 3 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 14 | 4 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 14 | 5 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 14 | 6 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 14 | 7 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 14 | 8 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 14 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 14 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 14 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 14 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 14 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 14 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 14 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 15 | 1 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 15 | 2 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 15 | 3 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 15 | 4 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 15 | 5 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 15 | 6 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 15 | 7 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 15 | 8 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 15 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 15 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 15 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 15 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 15 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 15 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 15 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 16 | 1 | 11 | 255 | 0.13 | 2.33 | 0.92 | 0.37 | 0.06 | 2.88 | 1.06 |
| 3 | 16 | 2 | 6 | 246 | 0.18 | 2.63 | 0.92 | -0.08 | -0.33 | 2.88 | 0.02 |
| 3 | 16 | 3 | 7 | 250 | 0.16 | 2.06 | 0.92 | 0.60 | 0.06 | 2.88 | 0.14 |
| 3 | 16 | 4 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 16 | 5 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 16 | 6 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 16 | 7 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 16 | 8 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 16 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 16 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 16 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 16 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 16 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 16 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 16 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 17 | 1 | 3 | 276 | 0.06 | 0.96 | 0.09 | 1.73 | 1.63 | 2.88 | -0.92 |
| 3 | 17 | 2 | 3 | 271 | 0.04 | 0.76 | -0.78 | 1.73 | 1.63 | 2.88 | -0.69 |
| 3 | 17 | 3 | 2 | 281 | 0.02 | 0.83 | -0.78 | 1.73 | 1.63 | 2.88 | -1.30 |
| 3 | 17 | 4 | 2 | 248 | 0.09 | 1.76 | -0.35 | 0.87 | 0.06 | 2.88 | -1.08 |
| 3 | 17 | 5 | 2 | 279 | 0.07 | 2.32 | -0.78 | -0.06 | 1.63 | 2.88 | -1.37 |
| 3 | 17 | 6 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 17 | 7 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 17 | 8 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 17 | 9 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 17 | 10 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 17 | 11 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 17 | 12 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 17 | 13 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 17 | 14 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 17 | 15 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 18 | 1 | 7 | 203 | 0.1 | 1.31 | 0.09 | -0.08 | 0.06 | 2.88 | 0.85 |
| 3 | 18 | 2 | 5 | 180 | 0.1 | 1.09 | 0.92 | -0.73 | -0.72 | 2.88 | 0.83 |
| 3 | 18 | 3 | 7 | 182 | 0.13 | 1.31 | 0.09 | -0.25 | -0.72 | 2.88 | 0.73 |
| 3 | 18 | 4 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 18 | 5 | 0 | NA | NA | NA | NA | NA | NA | NA | NA |
Let’s check the compression summary to see how many cells at each level fall below the quantization error threshold.
compressionSummaryTable(hvt.results3[[3]]$compression_summary)

| segmentLevel | noOfCells | noOfCellsBelowQuantizationError | percentOfCellsBelowQuantizationErrorThreshold | parameters |
|---|---|---|---|---|
| 1 | 15 | 0 | 0 | n_cells: 15 quant.err: 0.2 distance_metric: L1_Norm error_metric: mean quant_method: kmeans |
| 2 | 207 | 176 | 0.85 | n_cells: 15 quant.err: 0.2 distance_metric: L1_Norm error_metric: mean quant_method: kmeans |
| 3 | 99 | 99 | 1 | n_cells: 15 quant.err: 0.2 distance_metric: L1_Norm error_metric: mean quant_method: kmeans |
As the compression summary table above shows, the quantization error
of all 99 cells at level 3 falls below the defined quantization error
threshold of 0.2, compared with 85% of the cells at level 2 and none at
level 1. Hence, we were able to compress 100% of the data.
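The per-level percentages in the summary table can be reproduced by hand. The sketch below is a base R illustration under stated assumptions, not the package's implementation: the data frame `cells`, its column names, and its values are hypothetical, and treating a cell as compliant only when its error is strictly below the threshold is also an assumption.

```r
# Toy cell table: one row per cell; NA marks an empty cell (n = 0).
# Column names and values are illustrative assumptions, not muHVT output.
cells <- data.frame(
  Segment.Level = c(1, 1, 2, 2, 2, 2, 3, 3),
  Quant.Error   = c(0.34, 0.41, 0.13, 0.37, 0.12, NA, 0.09, 0.13)
)
threshold <- 0.2

# Empty cells carry no quantization error, so drop them before counting.
filled <- cells[!is.na(cells$Quant.Error), ]

# Per level: number of filled cells and number strictly below the threshold.
n_cells <- tapply(filled$Quant.Error, filled$Segment.Level, length)
n_below <- tapply(filled$Quant.Error < threshold, filled$Segment.Level, sum)

summary_tbl <- data.frame(
  segmentLevel   = as.integer(names(n_cells)),
  noOfCells      = as.vector(n_cells),
  noOfCellsBelow = as.vector(n_below),
  percentBelow   = round(as.vector(n_below / n_cells), 2)
)
print(summary_tbl)
```

A level is fully compressed once `percentBelow` reaches 1, which is the condition the level 3 row of the summary table satisfies.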
muHVT::hvtHmap(
hvt.results3,
trainComputers,
child.level = 3,
hmap.cols = "Quant.Error",
line.width = c(0.6, 0.4, 0.2),
color.vec = c("#141B41", "#6369D1", "#D8D2E1"),
palette.color = 6,
show.points = TRUE,
centroid.size = 1,
quant.error.hmap = 0.2,
n_cells.hmap = 15
)

Figure 8: The Voronoi tessellation with the heat map overlaid for variable ’quant_error’ in the ’computers’ dataset
Now we will look for more insight into the cells by overlaying a
heatmap for the variable price at different levels, starting with
level one.
In the plot below, the mean price of each cell is calculated and
overlaid as a heatmap on the level one tessellation plot.
muHVT::hvtHmap(
hvt.results,
trainComputers,
child.level = 1,
hmap.cols = "price",
line.width = c(0.4),
color.vec = c("#141B41"),
palette.color = 6,
show.points = T,
centroid.size = 1.5,
quant.error.hmap = 0.2,
n_cells.hmap = 15
)
Figure 9: The Voronoi tessellation with the heat map overlaid for variable ’price’ at level 1 from the ’computers’ dataset
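To make the "mean price per cell" idea concrete, here is a minimal sketch in base R. It assumes that plain k-means can stand in for the level-one cells; the data frame `toy` and its values are hypothetical, and this is not the package's internal code.

```r
set.seed(240)
# Toy data standing in for a scaled data set (hypothetical values)
toy <- data.frame(price = rnorm(200), speed = rnorm(200))

# Quantize into 5 cells with k-means, as a stand-in for level-one cells
km <- kmeans(toy, centers = 5, nstart = 5)

# Mean price per cell: the value a heat map would use to colour each cell
cell_mean_price <- tapply(toy$price, km$cluster, mean)
print(round(cell_mean_price, 2))
```

Each entry of `cell_mean_price` corresponds to one cell, so colouring the cells by these values gives the kind of overlay shown in the figure.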
Now we will go one level deeper and overlay the heatmap for
price at level 2, shown in the plot below. This should give us
better insight into the price distribution across cells.
muHVT::hvtHmap(
hvt.results2,
trainComputers,
child.level = 2,
hmap.cols = "price",
line.width = c(0.4, 0.2),
color.vec = c("#141B41", "#0582CA"),
palette.color = 6,
show.points = T,
centroid.size = 2,
quant.error.hmap = 0.2,
n_cells.hmap = 15
)
Figure 10: The Voronoi tessellation with the heat map overlaid for the variable ’price’ at level 2 from the ’computers’ dataset
Let us go one level deeper and overlay the heatmap for price
at level 3, shown in the plot below.
muHVT::hvtHmap(
hvt.results3,
trainComputers,
child.level = 3,
hmap.cols = "price",
line.width = c(0.6, 0.4, 0.2),
color.vec = c("#141B41", "#6369D1", "#D8D2E1"),
palette.color = 6,
show.points = T,
centroid.size = 1,
quant.error.hmap = 0.2,
n_cells.hmap = 15
)
Figure 11: The Voronoi tessellation with the heat map overlaid for the variable ’price’ at level 3 from the ’computers’ dataset
Using the exploded_hmap() function, we can also generate
an interactive 3D heatmap for any variable of the data. We can choose
how many HVT Levels are displayed in this plot.
The image below shows what the 3D heatmap for the price
variable would look like for Levels 1 and 2.
exploded_hmap(hvt.results3, child.level = 2, hmap.cols = "price", n_cells.hmap = 15, dim_size = 1000)
Let’s repeat the steps above for the speed variable.
The heatmap for speed variable for different cells at
level 1 can be seen in the plot below.
muHVT::hvtHmap(
hvt.results,
trainComputers,
child.level = 1,
hmap.cols = "speed",
line.width = c(0.4),
color.vec = c("#141B41"),
palette.color = 6,
show.points = T,
centroid.size = 1.5,
quant.error.hmap = 0.2,
n_cells.hmap = 15
)
Figure 12: The Voronoi tessellation with the heat map overlaid for variable ’speed’ at level 1 from the ’computers’ dataset
Now we will go one level deeper and overlay the heatmap for
speed at level 2.
muHVT::hvtHmap(
hvt.results2,
trainComputers,
child.level = 2,
hmap.cols = "speed",
line.width = c(0.6, 0.2),
color.vec = c("#141B41", "#0582CA"),
palette.color = 6,
show.points = T,
centroid.size = 1.5,
quant.error.hmap = 0.2,
n_cells.hmap = 15
)
Figure 13: The Voronoi tessellation with the heat map overlaid for the variable ’speed’ at level 2 from the ’computers’ dataset
Let us go one level deeper and overlay the heatmap for speed
at level 3, shown in the plot below.
muHVT::hvtHmap(
hvt.results3,
trainComputers,
child.level = 3,
hmap.cols = "speed",
line.width = c(0.6, 0.4, 0.2),
color.vec = c("#141B41", "#6369D1", "#D8D2E1"),
palette.color = 6,
show.points = T,
centroid.size = 1,
quant.error.hmap = 0.2,
n_cells.hmap = 15
)
Figure 14: The Voronoi tessellation with the heat map overlaid for the variable ’speed’ at level 3 from the ’computers’ dataset
Now that we have built the model, let us use the test dataset to predict which cell and which level each point belongs to.
predictHVT(data,
hvt.results,
hmap.cols = NULL,
child.level = 1,
...)
The important parameters for the function predictHVT are as below:
data - A dataframe containing the test dataset. The dataframe should have at least one variable used for training. The variables from this dataset can also be used to overlay as a heatmap
hvt.results - A list of hvt.results obtained from the HVT function while performing hierarchical vector quantization on training data
hmap.cols - The column number or column name from the dataset indicating the variables for which the heat map is to be plotted. A heatmap won’t be plotted if NULL is passed (Default = NULL)
child.level - A number indicating the level for which the heat map is to be plotted (only used if hmap.cols is not NULL)
... - color.vec and line.width can be passed from here
set.seed(240)
predictions <- muHVT::predictHVT(
testComputers,
hvt.results3,
hmap.cols = "Quant.Error",
child.level = 3,
line.width = c(1.2, 0.8, 0.4),
color.vec = c("#141B41", "#6369D1", "#D8D2E1"),
quant.error.hmap = 0.2,
n_cells.hmap = 15
)
The prediction algorithm recursively calculates the distance between each point in the test dataset and the cell centroids for each level. The following steps explain the prediction method for a single point in the test dataset :
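As a rough illustration of recursive nearest-centroid assignment, the sketch below walks a single point down a small hand-built hierarchy using the L1 distance. The node structure (a `centroids` matrix plus a `children` list) and the function names `l1_dist` and `assign_cell` are hypothetical and chosen for this example; they are not the package's internal representation.

```r
# L1 (Manhattan) distance from one point to every row of a centroid matrix
l1_dist <- function(point, centroids) {
  rowSums(abs(sweep(centroids, 2, point)))
}

# Walk down a hypothetical hierarchy: a node holds a centroid matrix and,
# optionally, a list with one child node (or NULL) per centroid
assign_cell <- function(point, node) {
  path <- integer(0)
  while (!is.null(node)) {
    nearest <- which.min(l1_dist(point, node$centroids))
    path <- c(path, nearest)
    node <- node$children[[nearest]]
  }
  path
}

# Two-level toy hierarchy: only the second level-1 cell has children
level2 <- list(centroids = rbind(c(2, 2), c(4, 4)), children = NULL)
level1 <- list(
  centroids = rbind(c(0, 0), c(3, 3)),
  children = list(NULL, level2)
)

cell_path <- assign_cell(c(2.2, 1.9), level1)
print(cell_path)
```

The returned vector lists the index of the nearest centroid at each level, so its length is the depth at which the point came to rest.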
Let’s see which cell and level each point belongs to. For the sake of brevity, we will only show the first 10 rows
predictions[["scoredPredictedData"]] %>% head(100) %>%
round(2) %>%
as.data.frame() %>%
Table(scroll = T, limit = 10)

| Segment.Level | Segment.Parent | Segment.Child | n | Cell.ID | Quant.Error | price | speed | hd | ram | screen | ads | centroidRadius | diff | anomalyFlag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 3 | 9 | 1 | NA | 0 | -1.23 | -0.78 | -0.68 | -0.72 | 0.55 | -0.84 | NA | NA | 0 |
| 2 | 3 | 9 | 1 | NA | 0 | 1.38 | 0.09 | 3.06 | 3.19 | 0.55 | -0.84 | NA | NA | 0 |
| 2 | 3 | 9 | 1 | NA | 0 | -0.80 | 0.09 | -0.68 | -0.72 | -0.61 | -0.84 | NA | NA | 0 |
| 2 | 3 | 9 | 1 | NA | 0 | 0.23 | 2.67 | -0.41 | -0.72 | -0.61 | -0.84 | NA | NA | 0 |
| 2 | 3 | 9 | 1 | NA | 0 | 0.31 | 0.92 | 1.73 | 1.63 | 0.55 | -0.84 | NA | NA | 0 |
| 2 | 3 | 9 | 1 | NA | 0 | -0.51 | 0.92 | 3.06 | 0.06 | -0.61 | -0.84 | NA | NA | 0 |
| 2 | 3 | 9 | 1 | NA | 0 | 1.07 | 0.09 | 3.06 | 3.19 | 0.55 | -0.84 | NA | NA | 0 |
| 2 | 3 | 9 | 1 | NA | 0 | -1.22 | 0.92 | -0.08 | 0.06 | -0.61 | -0.84 | NA | NA | 0 |
| 2 | 3 | 9 | 1 | NA | 0 | -0.93 | 0.92 | -0.08 | -0.72 | -0.61 | -0.84 | NA | NA | 0 |
| 2 | 3 | 9 | 1 | NA | 0 | -1.12 | -0.78 | -0.68 | -0.72 | -0.61 | -0.84 | NA | NA | 0 |
We can see the predictions for some of the points in the table above.
The columns Segment.Level, Segment.Parent and Segment.Child together show
the level and the cell that each point is mapped to. The centroid of the
cell that the point is mapped to is the codeword (predictor) for that cell.
The prediction algorithm will not work if some of the variables used to perform quantization are missing, so we should not remove any features from the test dataset. The line that drops screen and ads is therefore left commented out below.
set.seed(240)
# testComputers <- testComputers %>% dplyr::select(-c(screen,ads))
predictions <- muHVT::predictHVT(
testComputers,
hvt.results3,
hmap.cols = "Quant.Error",
child.level = 3,
line.width = c(0.6, 0.4, 0.2),
color.vec = c("#141B41", "#6369D1", "#D8D2E1"),
centroid_size = 1,
quant.error.hmap = 0.2,
n_cells.hmap = 15
)
predictions[["predictPlot"]]
predictions[["scoredPredictedData"]] %>% head(100) %>%
round(2) %>%
as.data.frame() %>%
Table(scroll = T, limit = 10)

| Segment.Level | Segment.Parent | Segment.Child | n | Cell.ID | Quant.Error | price | speed | hd | ram | screen | ads | centroidRadius | diff | anomalyFlag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 3 | 9 | 1 | NA | 0 | -1.23 | -0.78 | -0.68 | -0.72 | 0.55 | -0.84 | NA | NA | 0 |
| 2 | 3 | 9 | 1 | NA | 0 | 1.38 | 0.09 | 3.06 | 3.19 | 0.55 | -0.84 | NA | NA | 0 |
| 2 | 3 | 9 | 1 | NA | 0 | -0.80 | 0.09 | -0.68 | -0.72 | -0.61 | -0.84 | NA | NA | 0 |
| 2 | 3 | 9 | 1 | NA | 0 | 0.23 | 2.67 | -0.41 | -0.72 | -0.61 | -0.84 | NA | NA | 0 |
| 2 | 3 | 9 | 1 | NA | 0 | 0.31 | 0.92 | 1.73 | 1.63 | 0.55 | -0.84 | NA | NA | 0 |
| 2 | 3 | 9 | 1 | NA | 0 | -0.51 | 0.92 | 3.06 | 0.06 | -0.61 | -0.84 | NA | NA | 0 |
| 2 | 3 | 9 | 1 | NA | 0 | 1.07 | 0.09 | 3.06 | 3.19 | 0.55 | -0.84 | NA | NA | 0 |
| 2 | 3 | 9 | 1 | NA | 0 | -1.22 | 0.92 | -0.08 | 0.06 | -0.61 | -0.84 | NA | NA | 0 |
| 2 | 3 | 9 | 1 | NA | 0 | -0.93 | 0.92 | -0.08 | -0.72 | -0.61 | -0.84 | NA | NA | 0 |
| 2 | 3 | 9 | 1 | NA | 0 | -1.12 | -0.78 | -0.68 | -0.72 | -0.61 | -0.84 | NA | NA | 0 |
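Since scoring depends on the test data containing every variable used for quantization, a quick guard before calling the prediction function can be useful. The helper `missing_training_vars` below is a hypothetical convenience written for this example; it is not part of the package.

```r
# Return the training variables that are absent from a new data set
missing_training_vars <- function(train_cols, new_data) {
  setdiff(train_cols, colnames(new_data))
}

train_cols <- c("price", "speed", "hd", "ram", "screen", "ads")

# Hypothetical test set with 'screen' and 'ads' dropped
reduced_test <- data.frame(price = 1, speed = 2, hd = 3, ram = 4)

missing_cols <- missing_training_vars(train_cols, reduced_test)
print(missing_cols)
```

An empty result means the new data carries all training variables; a non-empty one names the columns to restore before scoring.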
In this section, we will see how we can use the package to visualize multidimensional data by projecting it to two dimensions using Sammon’s projection.
First of all, let us see how to generate data for a torus. We are using
the library geozoo for this purpose. Geo Zoo (short for
Geometric Zoo) is a compilation of geometric objects ranging from 3
to 10 dimensions. Geo Zoo contains regular or well-known objects, e.g.
cube and sphere, and some abstract objects, e.g. Boy’s surface, Torus
and Hyper-Torus.
Here we will generate a 3D torus with 9000 points.
set.seed(240)
# Here p represents dimension of object
# n represents number of points
torus <- geozoo::torus(p = 3,n = 9000)
torus_df <- data.frame(torus$points)
colnames(torus_df) <- c("x","y","z")
Now let’s do some EDA on the data. First of all, we will see what the data looks like.
Table(head(torus_df))

| x | y | z |
|---|---|---|
| -2.628238 | 0.5655770 | -0.7253285 |
| -1.417917 | -0.8902793 | 0.9454533 |
| -1.030820 | 1.1066495 | -0.8730506 |
| 1.884711 | 0.1894905 | 0.9943888 |
| -1.950608 | -2.2506838 | 0.2070521 |
| -1.482371 | 0.9228529 | 0.9672467 |
Now let’s have a look at the summary and structure of the data.
str(torus_df)
#> 'data.frame': 9000 obs. of 3 variables:
#> $ x: num -2.63 -1.42 -1.03 1.88 -1.95 ...
#> $ y: num 0.566 -0.89 1.107 0.189 -2.251 ...
#> $ z: num -0.725 0.945 -0.873 0.994 0.207 ...
summary(torus_df)
#> x y z
#> Min. :-2.99767 Min. :-2.999343 Min. :-0.9999999
#> 1st Qu.:-1.15065 1st Qu.:-1.120632 1st Qu.:-0.7130951
#> Median :-0.01899 Median : 0.001856 Median : 0.0033675
#> Mean :-0.00914 Mean : 0.004195 Mean : 0.0001237
#> 3rd Qu.: 1.13001 3rd Qu.: 1.130708 3rd Qu.: 0.7138584
#> Max. : 2.99713 Max. : 2.999308 Max. : 1.0000000
Now let’s try to visualize the object in a 3D space.
#plot_torus <- plotly::plot_ly(torus_df, x= ~x, y= ~y, z = ~z, color = ~z) %>% add_markers()
#plot_torus
knitr::include_graphics('torus.png')
Figure 15: 3D Torus
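For readers without geozoo installed, a torus with a similar range can be sampled directly from its parametric form. The sketch below assumes a major radius of 2 and a tube radius of 1, which is inferred from the summary above (x and y within about ±3, z within ±1); geozoo's own sampling scheme may differ, and drawing the angles uniformly does not give a uniform density on the surface.

```r
set.seed(240)
n <- 1000
R <- 2  # distance from the torus centre to the centre of the tube (assumed)
r <- 1  # radius of the tube (assumed)

theta <- runif(n, 0, 2 * pi)  # angle around the tube
phi   <- runif(n, 0, 2 * pi)  # angle around the torus axis

toy_torus <- data.frame(
  x = (R + r * cos(theta)) * cos(phi),
  y = (R + r * cos(theta)) * sin(phi),
  z = r * sin(theta)
)
summary(toy_torus)
```

The ranges in this summary should resemble those of torus_df, which makes `toy_torus` a reasonable stand-in for quick experiments.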
Now let’s try to use the package and project the above 3D object in a 2D Space. We will start with number of cells as 100.
set.seed(240)
hvt.torus <- muHVT::HVT(
torus_df,
n_cells = 100,
depth = 1,
quant.err = 0.06,
projection.scale = 10,
normalize = T,
distance_metric = "L1_Norm",
error_metric = "mean",
quant_method = "kmeans"
)
muHVT::plotHVT(
hvt.torus,
line.width = c(0.4),
color.vec = c("#141B41"),
centroid.size = 0.8,
maxDepth = 1
)
Figure 16: The Voronoi tessellation for level 1 shown for the 100 cells in the dataset ’torus’
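The two steps behind a plot like this, compression followed by projection, can be sketched with base R and the MASS package, whose `sammon()` function implements Sammon's non-linear mapping. This is an illustration under assumptions: it uses plain k-means, Euclidean distances and toy data, whereas the package call above uses the L1 norm and its own pipeline.

```r
library(MASS)  # provides sammon()

set.seed(240)
# Toy 3D points on a torus-like surface (illustrative, not the data above)
theta <- runif(500, 0, 2 * pi)
phi <- runif(500, 0, 2 * pi)
pts <- cbind(
  x = (2 + cos(theta)) * cos(phi),
  y = (2 + cos(theta)) * sin(phi),
  z = sin(theta)
)

# Step 1: compress the rows into 30 cell centroids
km <- kmeans(pts, centers = 30, nstart = 5, iter.max = 50)

# Step 2: project the centroids to 2D while preserving pairwise distances
proj <- sammon(dist(km$centers), k = 2, trace = FALSE)
head(proj$points)
```

The rows of `proj$points` are the 2D coordinates of the centroids; a Voronoi tessellation drawn around them would give the cell boundaries.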
Let’s check out the compression summary for torus.
compressionSummaryTable(hvt.torus[[3]]$compression_summary)

| segmentLevel | noOfCells | noOfCellsBelowQuantizationError | percentOfCellsBelowQuantizationErrorThreshold | parameters |
|---|---|---|---|---|
| 1 | 100 | 0 | 0 | n_cells: 100 quant.err: 0.06 distance_metric: L1_Norm error_metric: mean quant_method: kmeans |
As can be seen in the table above, none of the 100 cells fall below the quantization error threshold.
Let’s overlay the heatmap for quantization error for level 1.
muHVT::hvtHmap(
hvt.torus,
torus_df,
child.level = 1,
hmap.cols = "Quant.Error",
line.width = c(0.4),
color.vec = c("#141B41"),
palette.color = 6,
show.points = T,
centroid.size = 1,
quant.error.hmap = 0.06,
n_cells.hmap = 15
)
Figure 17: The Voronoi tessellation for level 1 with the heat map overlaid for variable ’quant_error’ in the ’torus’ dataset
Now let’s overlay the heatmap for x, y and z variables
muHVT::hvtHmap(
hvt.torus,
torus_df,
child.level = 1,
hmap.cols = "x",
line.width = c(0.4),
color.vec = c("#141B41"),
palette.color = 6,
show.points = T,
centroid.size = 1,
quant.error.hmap = 0.06,
n_cells.hmap = 15
)
Figure 18: The Voronoi tessellation for level 1 with the heat map overlaid for variable ’x’ in the ’torus’ dataset
muHVT::hvtHmap(
hvt.torus,
torus_df,
child.level = 1,
hmap.cols = "y",
line.width = c(0.4),
color.vec = c("#141B41"),
palette.color = 6,
show.points = T,
centroid.size = 1,
quant.error.hmap = 0.06,
n_cells.hmap = 15
)
Figure 19: The Voronoi tessellation for level 1 with the heat map overlaid for variable ’y’ in the ’torus’ dataset
muHVT::hvtHmap(
hvt.torus,
torus_df,
child.level = 1,
hmap.cols = "z",
line.width = c(0.4),
color.vec = c("#141B41"),
palette.color = 6,
show.points = T,
centroid.size = 1,
quant.error.hmap = 0.06,
n_cells.hmap = 15
)
Figure 20: The Voronoi tessellation for level 1 with the heat map overlaid for variable ’z’ in the ’torus’ dataset
Now let’s try again with the number of cells as 300.
set.seed(240)
hvt.torus2 <- muHVT::HVT(
torus_df,
n_cells = 300,
depth = 1,
quant.err = 0.06,
projection.scale = 10,
normalize = T,
distance_metric = "L1_Norm",
error_metric = "mean",
quant_method = "kmeans"
)
muHVT::plotHVT(
hvt.torus2,
line.width = c(0.4),
color.vec = c("#141B41"),
centroid.size = 0.8,
maxDepth = 1
)
Figure 21: The Voronoi tessellation for level 1 shown for the 300 cells in the dataset ’torus’
Let’s check out the compression summary for torus.
compressionSummaryTable(hvt.torus2[[3]]$compression_summary)

| segmentLevel | noOfCells | noOfCellsBelowQuantizationError | percentOfCellsBelowQuantizationErrorThreshold | parameters |
|---|---|---|---|---|
| 1 | 300 | 62 | 0.21 | n_cells: 300 quant.err: 0.06 distance_metric: L1_Norm error_metric: mean quant_method: kmeans |
It can be observed from the table above that only 62 cells out of 300,
i.e. 21% of the cells, fall below the quantization error
threshold.
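The count of cells below the threshold can be reproduced in spirit with a few lines of base R. The sketch below assumes that a cell's quantization error is the mean L1 distance between its points and its centroid, which is one reading of the `L1_Norm` and `mean` settings above; the package may scale or aggregate differently, and the data and threshold here are hypothetical.

```r
set.seed(240)
toy <- scale(matrix(rnorm(600), ncol = 3))  # hypothetical normalized data
threshold <- 1.5                            # hypothetical error threshold

km <- kmeans(toy, centers = 20, nstart = 5, iter.max = 50)

# L1 distance of each point to its own cell centroid, then averaged per cell
point_err <- rowSums(abs(toy - km$centers[km$cluster, ]))
cell_err <- tapply(point_err, km$cluster, mean)

below <- cell_err < threshold
c(cells = length(cell_err), below = sum(below), share = round(mean(below), 2))
```

Raising `centers` makes the cells smaller and generally lowers `cell_err`, which is why increasing the number of cells tends to move more of them below the threshold.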
Let’s understand this visually by overlaying the heatmap for Quantization Error at level 1.
muHVT::hvtHmap(
hvt.torus2,
torus_df,
child.level = 1,
hmap.cols = "Quant.Error",
line.width = c(0.4),
color.vec = c("#141B41"),
palette.color = 6,
centroid.size = 1,
show.points = T,
quant.error.hmap = 0.06,
n_cells.hmap = 15
)
Figure 22: The Voronoi tessellation for level 1 with the heat map overlaid for variable ’quant_error’ in the ’torus’ dataset
Now let’s overlay the heatmap for the x, y and z variables at level 1.
muHVT::hvtHmap(
hvt.torus2,
torus_df,
child.level = 1,
hmap.cols = "x",
line.width = c(0.4),
color.vec = c("#141B41"),
palette.color = 6,
centroid.size = 1,
show.points = T,
quant.error.hmap = 0.06,
n_cells.hmap = 15
)
Figure 23: The Voronoi tessellation for level 1 with the heat map overlaid for variable ’x’ in the ’torus’ dataset
muHVT::hvtHmap(
hvt.torus2,
torus_df,
child.level = 1,
hmap.cols = "y",
line.width = c(0.4),
color.vec = c("#141B41"),
palette.color = 6,
centroid.size = 1,
show.points = T,
quant.error.hmap = 0.06,
n_cells.hmap = 15
)
Figure 24: The Voronoi tessellation for level 1 with the heat map overlaid for variable ’y’ in the ’torus’ dataset
muHVT::hvtHmap(
hvt.torus2,
torus_df,
child.level = 1,
hmap.cols = "z",
line.width = c(0.4),
color.vec = c("#141B41"),
palette.color = 6,
centroid.size = 1,
show.points = T,
quant.error.hmap = 0.06,
n_cells.hmap = 15
)
Figure 25: The Voronoi tessellation for level 1 with the heat map overlaid for variable ’z’ in the ’torus’ dataset
Let’s increase the number of cells to 900.
set.seed(240)
hvt.torus3 <- muHVT::HVT(
torus_df,
n_cells = 900,
depth = 1,
quant.err = 0.06,
projection.scale = 10,
normalize = T,
distance_metric = "L1_Norm",
error_metric = "mean",
quant_method = "kmeans"
)
muHVT::plotHVT(
hvt.torus3,
line.width = c(0.4),
color.vec = c("#141B41"),
centroid.size = 0.6,
maxDepth = 1
)
Figure 26: The Voronoi tessellation for level 1 shown for the 900 cells in the dataset ’torus’
Let’s check the compression summary for torus.
compressionSummaryTable(hvt.torus3[[3]]$compression_summary)

| segmentLevel | noOfCells | noOfCellsBelowQuantizationError | percentOfCellsBelowQuantizationErrorThreshold | parameters |
|---|---|---|---|---|
| 1 | 900 | 856 | 0.95 | n_cells: 900 quant.err: 0.06 distance_metric: L1_Norm error_metric: mean quant_method: kmeans |
By increasing the number of cells to 900, 856 of the 900 cells (95%) now
fall below the quantization error threshold, i.e. we were able
to compress 95% of the data.
Let’s check out the heatmap for quantization error.
muHVT::hvtHmap(
hvt.torus3,
torus_df,
child.level = 1,
hmap.cols = "Quant.Error",
line.width = c(0.4),
color.vec = c("#141B41"),
palette.color = 6,
show.points = T,
centroid.size = 0.8,
quant.error.hmap = 0.06,
n_cells.hmap = 15
)
Figure 27: The Voronoi tessellation with the heat map overlaid for variable ’quant_error’ in the ’torus’ dataset
Now let’s check out the heatmap for the x, y and z variables.
muHVT::hvtHmap(
hvt.torus3,
torus_df,
child.level = 1,
hmap.cols = "x",
line.width = c(0.4),
color.vec = c("#141B41"),
palette.color = 6,
show.points = T,
centroid.size = 0.8,
quant.error.hmap = 0.06,
n_cells.hmap = 15
)
Figure 28: The Voronoi tessellation with the heat map overlaid for variable ’x’ in the ’torus’ dataset
muHVT::hvtHmap(
hvt.torus3,
torus_df,
child.level = 1,
hmap.cols = "y",
line.width = c(0.4),
color.vec = c("#141B41"),
palette.color = 6,
show.points = T,
centroid.size = 0.8,
quant.error.hmap = 0.06,
n_cells.hmap = 15
)
Figure 29: The Voronoi tessellation with the heat map overlaid for variable ’y’ in the ’torus’ dataset
muHVT::hvtHmap(
hvt.torus3,
torus_df,
child.level = 1,
hmap.cols = "z",
line.width = c(0.4),
color.vec = c("#141B41"),
palette.color = 6,
show.points = T,
centroid.size = 0.8,
quant.error.hmap = 0.06,
n_cells.hmap = 15
)
Figure 30: The Voronoi tessellation with the heat map overlaid for variable ’z’ in the ’torus’ dataset
Let’s use the hierarchical vector quantization technique and go one level deeper, this time with the number of cells set to 500.
set.seed(240)
hvt.torus4 <- muHVT::HVT(
torus_df,
n_cells = 500,
depth = 2,
quant.err = 0.06,
projection.scale = 10,
normalize = T,
distance_metric = "L1_Norm",
error_metric = "mean",
quant_method = "kmeans"
)
muHVT::plotHVT(
hvt.torus4,
line.width = c(0.5, 0.3),
color.vec = c("#141B41", "#0582CA"),
centroid.size = 1,
maxDepth = 2
)
Figure 31: The Voronoi tessellation for level 2 shown for the 500 cells in the dataset ’torus’
Let’s check the compression summary for torus.
compressionSummaryTable(hvt.torus4[[3]]$compression_summary)

| segmentLevel | noOfCells | noOfCellsBelowQuantizationError | percentOfCellsBelowQuantizationErrorThreshold | parameters |
|---|---|---|---|---|
| 1 | 500 | 346 | 0.69 | n_cells: 500 quant.err: 0.06 distance_metric: L1_Norm error_metric: mean quant_method: kmeans |
| 2 | 468 | 468 | 1 | n_cells: 500 quant.err: 0.06 distance_metric: L1_Norm error_metric: mean quant_method: kmeans |
From the above table, we can observe that we were able to compress
100% of the data using hierarchical Vector
Quantization.
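The idea of going one level deeper only where needed can be sketched as a short recursive function. This is a simplified illustration under assumptions: it uses base R k-means, defines cell error as the mean L1 distance to the cell mean, and runs on hypothetical data with a hypothetical threshold. The names `cell_error` and `hvq` are invented for this example and do not reflect the package's implementation.

```r
# Mean L1 distance of the rows of x to their column means
cell_error <- function(x) mean(rowSums(abs(sweep(x, 2, colMeans(x)))))

# Recursively split cells until they meet the threshold or max depth is reached.
# Returns one row per leaf cell with its level, size and error.
hvq <- function(x, n_cells, threshold, max_depth, level = 1) {
  km <- kmeans(x, centers = n_cells, nstart = 3, iter.max = 50)
  leaves <- list()
  for (cell in seq_len(n_cells)) {
    members <- x[km$cluster == cell, , drop = FALSE]
    err <- cell_error(members)
    can_split <- level < max_depth && nrow(members) > n_cells
    if (err > threshold && can_split) {
      # Cell is still too coarse: quantize its members one level deeper
      leaves[[cell]] <- hvq(members, n_cells, threshold, max_depth, level + 1)
    } else {
      leaves[[cell]] <- data.frame(level = level, n = nrow(members), error = err)
    }
  }
  do.call(rbind, leaves)
}

set.seed(240)
toy <- matrix(rnorm(900), ncol = 3)
leaves <- hvq(toy, n_cells = 5, threshold = 1, max_depth = 2)
table(leaves$level)
```

Only cells whose error exceeds the threshold are split further, so deeper levels appear where the data is hardest to compress.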
Let’s also observe the Quantization Error heatmap.
muHVT::hvtHmap(
hvt.torus4,
torus_df,
child.level = 2,
hmap.cols = "Quant.Error",
line.width = c(0.6, 0.3),
color.vec = c("#141B41", "#6369D1"),
palette.color = 6,
show.points = T,
centroid.size = 1,
quant.error.hmap = 0.06,
n_cells.hmap = 15
)
Figure 32: The Voronoi tessellation with the heat map overlaid for variable ’quant_error’ in the ’torus’ dataset
Now let’s observe the heatmaps for the x, y and z variables at level 2.
muHVT::hvtHmap(
hvt.torus4,
torus_df,
child.level = 2,
hmap.cols = "x",
line.width = c(0.6, 0.3),
color.vec = c("#141B41", "#6369D1"),
palette.color = 6,
show.points = T,
centroid.size = 1,
quant.error.hmap = 0.06,
n_cells.hmap = 15
)
Figure 33: The Voronoi tessellation with the heat map overlaid for variable ’x’ in the ’torus’ dataset
muHVT::hvtHmap(
hvt.torus4,
torus_df,
child.level = 2,
hmap.cols = "y",
line.width = c(0.6, 0.3),
color.vec = c("#141B41", "#6369D1"),
palette.color = 6,
show.points = T,
centroid.size = 1,
quant.error.hmap = 0.06,
n_cells.hmap = 15
)
Figure 34: The Voronoi tessellation with the heat map overlaid for variable ’y’ in the ’torus’ dataset
muHVT::hvtHmap(
hvt.torus4,
torus_df,
child.level = 2,
hmap.cols = "z",
line.width = c(0.6, 0.3),
color.vec = c("#141B41", "#6369D1"),
palette.color = 6,
show.points = T,
centroid.size = 1,
quant.error.hmap = 0.06,
n_cells.hmap = 15
)
Figure 35: The Voronoi tessellation with the heat map overlaid for variable ’z’ in the ’torus’ dataset
Pricing Segmentation - The package can be used to discover groups of similar customers based on the customer spend pattern and understand price sensitivity of customers
Market Segmentation - The package can be helpful in market segmentation where we have to identify micro and macro segments. The method used in this package can do both kinds of segmentation in one go
Anomaly Detection - This method can help us categorize system behavior over time and find anomalies when there are changes in the system, for example, finding fraudulent claims in healthcare insurance
The package can help us understand the underlying structure of the data. Suppose we want to analyze a curved surface such as sphere or vase, we can approximate it by a lot of small low-order polygons in the form of tessellations using this package
In biology, Voronoi diagrams are used to model a number of different biological structures, including cells and bone microarchitecture
Using the base idea of Systems Dynamics, these diagrams can also be used to depict customer state changes over a period of time
Topology Preserving Maps : https://link.springer.com/chapter/10.1007/1-84628-118-0_7
Vector Quantization : https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-450-principles-of-digital-communications-i-fall-2006/lecture-notes/book_3.pdf
Sammon’s Projection : http://en.wikipedia.org/wiki/Sammon_mapping
Voronoi Tessellations : http://en.wikipedia.org/wiki/Centroidal_Voronoi_tessellation